Field of the Invention

The present invention relates to toxins isolated from 
bacteria and the use of said toxins as insecticides. 

Background of the Invention 



Many insects are widely regarded as pests by homeowners,
picnickers, gardeners, and farmers and others whose
investments in agricultural products are often destroyed or
diminished as a result of insect damage to field crops.
Particularly in areas where the growing season is short,
significant insect damage can mean the loss of all profits to
growers and a dramatic decrease in crop yield. Scarce supply of
particular agricultural products invariably results in higher
costs to food processors and, then, to the ultimate consumers of
food plants and products derived from those plants.

Preventing insect damage to crops and flowers and
eliminating the nuisance of insect pests have typically relied on
strong organic pesticides and insecticides with broad toxicities.
These synthetic products have come under attack by the general
population as being too harsh on the environment and on those
exposed to such agents. Similarly, in non-agricultural settings,
homeowners would be satisfied to have insects avoid their homes
or outdoor meals without needing to kill the insects.

The extensive use of chemical insecticides has raised
environmental and health concerns for farmers, companies that
produce the insecticides, government agencies, public interest
groups, and the public in general. The development of less
intrusive pest management strategies has been spurred along both
by societal concern for the environment and by the development of
biological tools which exploit mechanisms of insect management.
Biological control agents present a promising alternative to
chemical insecticides.

Organisms at every evolutionary development level have
devised means to enhance their own success and survival. The use
of biological molecules as tools of defense and aggression is
known throughout the animal and plant kingdoms. In addition, the
relatively new tools of the genetic engineer allow modifications
to biological insecticides to accomplish particular solutions to
particular problems.

One such agent, Bacillus thuringiensis (Bt), is an effective
insecticidal agent, and is widely used commercially as such. In
fact, the insecticidal agent of the Bt bacterium is a protein
which has such limited toxicity that it can be used on human food
crops on the day of harvest. To non-targeted organisms, the Bt
toxin is a digestible non-toxic protein.

Another known class of biological insect control agents is
certain genera of nematodes known to be vectors of transmission
for insect-killing bacterial symbionts. Nematodes containing
insecticidal bacteria invade insect larvae. The bacteria then
kill the larvae. The nematodes reproduce in the larval cadaver.
The nematode progeny then eat the cadaver from within. The
bacteria-containing nematode progeny thus produced can then
invade additional larvae.

In the past, insecticidal nematodes in the Steinernema and
Heterorhabditis genera were used as insect control agents.
Apparently, each genus of nematode hosts a particular species of
bacterium. In nematodes of the Heterorhabditis genus, the
symbiotic bacterium is Photorhabdus luminescens.

Although these nematodes are effective insect control 
agents, it is presently difficult, expensive, and inefficient to 
produce, maintain, and distribute nematodes for insect control. 

It has been known in the art that one may isolate an
insecticidal toxin from Photorhabdus luminescens that has
activity only when injected into Lepidopteran and Coleopteran
insect larvae. This has made it impossible to effectively
exploit the insecticidal properties of the nematode or its
bacterial symbiont. What would be useful is a more practical,
less labor-intensive, wide-area method of delivering an
insecticidal toxin that retains its biological properties
after delivery. It would be highly desirable to discover toxins
with oral activity produced by the genus Photorhabdus. The
isolation and use of these toxins are desirable for reasons of
efficacy. Until applicants' discoveries, these toxins had not
been isolated or characterized.

Summary of the Invention

The native toxins are protein complexes that are produced
and secreted by growing bacterial cells of the genus Photorhabdus.
Of interest are the proteins produced by the species Photorhabdus
luminescens. The protein complexes, with a molecular size of
approximately 1,000 kDa, can be separated by SDS-PAGE gel
analysis into numerous component proteins. The toxins contain no
hemolysin, lipase, type C phospholipase, or nuclease activities.
The toxins exhibit significant toxicity upon exposure
to a number of insects.

The present invention provides an easily administered 
insecticidal protein as well as the expression of toxin in a 
heterologous system. 

The present invention also provides a method for delivering
insecticidal toxins that are functionally active and effective
against many orders of insects.

Objects, advantages, and features of the present invention
will become apparent from the following specification.

Brief Description of the Drawings 

Fig. 1 is an illustration of a match of cloned DNA isolates
used as a part of sequencing the genes for the toxin of the
present invention.

Fig. 2 is a map of three plasmids used in the sequencing 
process. 

Fig. 3 is a map illustrating the inter-relationship of
several partial DNA fragments.

Fig. 4 is an illustration of a homology analysis between the
protein sequences of TcbAii and TcaBii proteins.

Fig. 5 is a phenogram of Photorhabdus strains. The relationship
of Photorhabdus strains was defined by rep-PCR.
The upper axis of Fig. 5 measures the percentage similarity of
strains based on scoring of rep-PCR products (i.e., 0.0 [no
similarity] to 1.0 [100% similarity]). At the right axis, the
numbers and letters indicate the various strains tested: 14=W-14,
Hm=Hm, H9=H9, 7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4,
9=WX-9, 8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 5=WX-5,
6=WX-6, 12=WX-12, x14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through
52=ATCC 43948 through ATCC 43952. Vertical lines separating
horizontal lines indicate the degree of relatedness (as read from
the extrapolated intersection of the vertical line with the upper
axis) between strains or groups of strains at the base of the
horizontal lines (e.g., strain W-14 is approximately 60% similar
to strains H9 and Hm).
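
The similarity scale described for Fig. 5 can be illustrated with a minimal sketch. The specification does not state which similarity coefficient was used to score the rep-PCR products, so the Jaccard coefficient on shared bands is an assumption here, and the band sizes and strain names below are hypothetical values chosen only for illustration.

```python
def band_similarity(bands_a, bands_b):
    """Jaccard similarity between two sets of scored rep-PCR bands.

    Returns a value from 0.0 (no shared bands) to 1.0 (identical
    band patterns), matching the scale on the upper axis of a
    phenogram. The choice of Jaccard is an assumption, not the
    method stated in the specification.
    """
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        # No bands scored for either strain: treat as no evidence
        # of similarity.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical band sizes (in base pairs) for two strains.
strain_x = {300, 450, 700, 1200, 1500}
strain_y = {300, 450, 700, 900, 1500}

similarity = band_similarity(strain_x, strain_y)
print(f"similarity = {similarity:.2f}")
```

A pairwise matrix of such values is the usual input to a clustering step that produces a phenogram; that step is omitted here.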

Fig. 6 is an illustration of the genomic maps of the W-14
strain.

Detailed Description of the Invention

The present inventions are directed to the discovery of a
unique class of insecticidal protein toxins from the genus
Photorhabdus that have oral toxicity against insects. A unique
feature of Photorhabdus is its bioluminescence. Photorhabdus may
be isolated from a variety of sources. One such source is
nematodes, more particularly nematodes of the genus
Heterorhabditis. Another such source is human clinical
samples from wounds, see Farmer et al. 1989 J. Clin. Microbiol.
27 pp. 1594-1600. These saprophytic strains are deposited in the
American Type Culture Collection (Rockville, MD) as ATCC #s 43948,
43949, 43950, 43951, and 43952, and are incorporated herein by
reference. It is possible that other sources could harbor
Photorhabdus bacteria that produce insecticidal toxins. Such
sources in the environment could be either terrestrial or aquatic
based.

The genus Photorhabdus is taxonomically defined as a member
of the Family Enterobacteriaceae, although it has certain traits
atypical of this family. For example, strains of this genus are
nitrate reduction negative, yellow and red pigment producing and
bioluminescent. This latter trait is otherwise unknown within
the Enterobacteriaceae. Photorhabdus has only recently been
described as a genus separate from Xenorhabdus (Boemare
et al., 1993 Int. J. Syst. Bacteriol. 43, 249-255). This
differentiation is based on DNA-DNA hybridization studies,
phenotypic differences (e.g., presence (Photorhabdus) or absence
(Xenorhabdus) of catalase and bioluminescence) and the Family of
the nematode host (Xenorhabdus: Steinernematidae; Photorhabdus:
Heterorhabditidae). Comparative cellular fatty-acid analyses
(Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki
et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the
separation of Photorhabdus from Xenorhabdus.

In order to establish that the strain collection disclosed
herein was comprised of Photorhabdus strains, the strains were
characterized based on recognized traits which define
Photorhabdus and differentiate it from other Enterobacteriaceae
and Xenorhabdus species (Farmer, 1984 Bergey's Manual of
Systematic Bacteriology Vol. 1 pp. 510-511; Akhurst and Boemare
1988, J. Gen. Microbiol. 134 pp. 1835-1845; Boemare et al. 1993
Int. J. Syst. Bacteriol. 43 pp. 249-255, which are incorporated
herein by reference). The traits studied were the following: gram
stain negative rods, organism size, colony pigmentation, inclusion
bodies, presence of catalase, ability to reduce nitrate,
bioluminescence, dye uptake, gelatin hydrolysis, growth on
selective media, growth temperature, survival under anaerobic
conditions and motility. Fatty acid analysis was used to confirm
that the strains herein all belong to the single genus
Photorhabdus.

Currently, the bacterial genus Photorhabdus is comprised of
a single defined species, Photorhabdus luminescens (ATCC Type
strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A
variety of related strains have been described in the literature
(e.g., Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845;
Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz
et al. 1990, Appl. Environ. Microbiol., 56, 181-186). Numerous
Photorhabdus strains have been characterized herein. Such
strains are listed in Table 18 in the Examples. Because there is
currently only one species (luminescens) defined within the genus
Photorhabdus, the luminescens species traits were used to
characterize the strains herein. As can be seen in Fig. 5, these
strains are quite diverse. It is not unforeseen that in the
future there may be other Photorhabdus species that will have
some of the attributes of the luminescens species as well as some
different characteristics that are presently not defined as a
trait of Photorhabdus luminescens. However, the scope of the
invention herein extends to any Photorhabdus species or strains
which produce proteins that have functional activity as insect
control agents, regardless of other traits and characteristics.

Furthermore, as is demonstrated herein, the bacteria of the
genus Photorhabdus produce proteins that have functional activity
as defined herein. Of particular interest are proteins produced
by the species Photorhabdus luminescens. The inventions herein
should in no way be limited to the strains which are disclosed
herein. These strains illustrate for the first time that
proteins produced by diverse isolates of Photorhabdus are toxic
upon exposure to insects. Thus, included within the inventions
described herein are the strains specified herein and any mutants
thereof, as well as any strains or species of the genus
Photorhabdus that have the functional activity described herein.

There are several terms that are used herein that have a 
particular meaning and are as follows: 

By "functional activity" it is meant herein that the protein
toxins function as insect control agents in that the proteins are
orally active, or have a toxic effect, or are able to disrupt or
deter feeding, which may or may not cause death of the insect.
When an insect comes into contact with an effective amount of
toxin delivered via transgenic plant expression, formulated
protein composition(s), sprayable protein composition(s), a bait
matrix or other delivery system, the result is typically death
of the insect, or the insects do not feed upon the source which
makes the toxins available to the insects.

The protein toxins discussed herein are typically referred to as
"insecticides". By insecticides it is meant herein that the
protein toxins have a "functional activity" as further defined
herein and are used as insect control agents.

By the use of the term "oligonucleotides" it is meant a
macromolecule consisting of a short chain of nucleotides of
either RNA or DNA. Such a chain could be as short as one
nucleotide, but typically is in the range of about 10 to about 12
nucleotides. The determination of the length of the
oligonucleotide is well within the skill of an artisan and should
not be a limitation herein. Therefore, oligonucleotides may be
shorter than 10 or longer than 12 nucleotides.

By the use of the term "toxic" or "toxicity" as used herein it is
meant that the toxins produced by Photorhabdus have "functional
activity" as defined herein.

By the use of the term "genetic material" herein, it is meant to
include all genes, nucleic acid, DNA and RNA.



Fermentation broths from selected strains reported in
Table 18 were used to determine the breadth of insecticidal
toxin production by the Photorhabdus genus and the
insecticidal spectrum of these toxins, and to provide source
material to purify the toxin complexes. The strains
characterized herein have been shown to have oral toxicity
against a variety of insect orders. Such insect orders include
but are not limited to Coleoptera, Homoptera, Lepidoptera,
Diptera, Acarina, Hymenoptera and Dictyoptera.

By the use of the term "Photorhabdus toxin" it is meant any
protein produced by a Photorhabdus microorganism strain
which has functional activity against insects, where the
Photorhabdus toxin could be formulated as a sprayable
composition, expressed by a transgenic plant, formulated as
a bait matrix, delivered via a Baculovirus, or delivered by
any other applicable host or delivery system.

As with other bacterial toxins, the rate of mutation of the
bacteria in a population causes many related toxins slightly
different in sequence to exist. Toxins of interest here are
those which produce protein complexes toxic to a variety of
insects upon exposure, as described herein. Preferably, the
toxins are active against Lepidoptera, Coleoptera, Homoptera,
Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions
herein are intended to capture the protein toxins homologous to
protein toxins produced by the strains herein and any derivative
strains thereof, as well as any protein toxins produced by
Photorhabdus. These homologous proteins may differ in sequence,
but do not differ in function from those toxins described herein.
Homologous toxins are meant to include protein complexes of
between 300 kDa and 2,000 kDa that are comprised of at least two
(2) subunits, where a subunit is a peptide which may or may not
be the same as the other subunit. Various protein subunits have
been identified and are taught in the Examples herein.
Typically, the protein subunits are between about 18 kDa to about
230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to 160
kDa; about 80 kDa to about 100 kDa; and about 50 kDa to about 80
kDa.

As discussed above, some Photorhabdus strains can be
isolated from nematodes. Some nematodes, elongated cylindrical
parasitic worms of the phylum Nematoda, have evolved an ability
to exploit insect larvae as a favored growth environment. The
insect larvae provide a source of food for growing nematodes and
an environment in which to reproduce. One dramatic effect that
follows invasion of larvae by certain nematodes is larval death.
Larval death results from the presence of, in certain nematodes,
bacteria that produce an insecticidal toxin which arrests larval
growth and inhibits feeding activity.

Interestingly, it appears that each genus of insect
parasitic nematode hosts a particular species of bacterium,
uniquely adapted for symbiotic growth with that nematode. In the
interim since this research was initiated, the bacterial genus
Xenorhabdus was reclassified into the genera Xenorhabdus
and Photorhabdus. Bacteria of the genus Photorhabdus are
characterized as being symbionts of Heterorhabditis nematodes
while Xenorhabdus species are symbionts of the Steinernema
species. This change in nomenclature is reflected in this
specification, but in no way should a change in nomenclature
alter the scope of the inventions described herein.

The peptides and genes that are disclosed herein are named
according to the guidelines recently published in the Journal of
Bacteriology "Instructions to Authors" p. i-xii (Jan. 1996),
which is incorporated herein by reference. The following
peptides and genes were isolated from Photorhabdus strain W-14.

Peptide / Gene Nomenclature
Toxin complex (Tc)

Peptide Name     Gene Name     Patent Sequence ID#

tca genomic region
TcaA             tcaA          12
TcaAiii          tcaA          4
TcaBi            tcaB          3 (19, 20)
TcaBii                         5
TcaC             tcaC          2

tcb genomic region
TcbA             tcbA          16
TcbAi            tcbA          (pro-peptide)
TcbAii           tcbA          1 (21, 22, 23, 24)
TcbAiii          tcbA          40

tcc genomic region
TccA             tccA          8
TccB             tccB          7

tcd genomic region
TcdAi            tcdA          (pro-peptide)
TcdAii           tcdA          13, (38, 39, 17, 18)
TcdAiii          tcdA          41, (42, 43)
TcdB             tcdB          14

(Bracketed sequence numbers indicate internal amino acid sequences
obtained by tryptic digests.)



The sequences listed above are grouped by genomic region.
The tcbA gene was expressed in E. coli as two protein fragments,
TcbA and TcbAiii, as illustrated in the Examples. It may be
beneficial to have proteolytic clipping of some sequences to
obtain the higher activity of the toxins for commercial
transgenic applications.

The toxins described herein are quite unique in that the
toxins have functional activity, which is key to developing an
insect management strategy. In developing an insect management
strategy, it is possible to delay or circumvent the protein
degradation process by injecting a protein directly into an
organism, avoiding its digestive tract. In such cases, the
protein administered to the organism will retain its function
until it is denatured, non-specifically degraded, or eliminated
by the immune system in higher organisms. Injection into insects
of an insecticidal toxin has potential application only in the
laboratory, and then only on large insects which are easily
injected. The observation that the insecticidal protein toxins
herein described exhibit their toxic activity after oral
ingestion or contact with the toxins permits the development of
an insect management plan based solely on the ability to
incorporate the protein toxins into the insect diet. Such a plan
could result in the production of insect baits.

The Photorhabdus toxins may be administered to insects in a
purified form. The toxins may also be delivered in amounts from
about 1 to about 100 mg / liter of broth. This may vary upon
formulation condition, conditions of the inoculum source,
techniques for isolation of the toxin, and the like. The toxins
may be administered as an exudate secretion or cellular protein
originally expressed in a heterologous prokaryotic or eukaryotic
host. Bacteria are typically the hosts in which proteins are
expressed. Eukaryotic hosts could include but are not limited to
plants, insects and yeast. Alternatively, the toxins may be
produced in bacteria or transgenic plants in the field or in the
insect by a baculovirus vector. Typically the toxins will be
introduced to the insect by incorporating one or more of the
toxins into the insects' feed.

Complete lethality to feeding insects is useful but is not
required to achieve useful toxicity. If the insects avoid the
toxin or cease feeding, that avoidance will be useful in some
applications, even if the effects are sublethal. For example, if
insect resistant transgenic crop plants are desired, a reluctance
of insects to feed on the plants is as useful as lethal toxicity
to the insects since the ultimate objective is protection of the
plants rather than killing the insect.

There are many other ways in which toxins can be
incorporated into an insect's diet. As an example, it is
possible to adulterate the larval food source with the toxic
protein by spraying the food with a protein solution, as
disclosed herein. Alternatively, the gene encoding the protein
could be genetically engineered into an otherwise harmless
bacterium, which could then be grown in culture, and either
applied to the food source or allowed to reside in the soil in an
area in which insect eradication was desirable. Also, the protein
could be genetically engineered directly into an insect food
source. For instance, the major food source of many insect larvae
is plant material.

By incorporating genetic material that encodes the
insecticidal properties of the Photorhabdus toxins into the
genome of a plant eaten by a particular insect pest, the adult or
larvae would die after consuming the food plant. Numerous
members of the monocotyledonous and dicotyledonous genera have
been transformed. Transgenic agronomic crops as well as fruits
and vegetables are of commercial interest. Such crops include
but are not limited to maize, rice, soybeans, canola, sunflower,
alfalfa, sorghum, wheat, cotton, peanuts, tomatoes, potatoes, and
the like. Several techniques exist for introducing foreign
genetic material into plant cells, and for obtaining plants that
stably maintain and express the introduced gene. Such techniques
include acceleration of genetic material coated onto
microparticles directly into cells (U.S. Patents 4,945,050 to
Cornell and 5,141,131 to DowElanco). Plants may be transformed
using Agrobacterium technology, see U.S. Patent 5,177,010 to
University of Toledo, 5,104,310 to Texas A&M, European Patent
Application 0131624B1, European Patent Applications 120516,
159418B1 and 176,112 to Schilperoot, U.S. Patents 5,149,645,
5,469,976, 5,464,763 and 4,940,838 and 4,693,976 to Schilperoot,
European Patent Applications 116718, 290799, 320500 all to
Max Planck, European Patent Applications 604662 and 627752 to
Japan Tobacco, European Patent Applications 0267159, and 0292435
and U.S. Patent 5,231,019 all to Ciba Geigy, U.S. Patents
5,463,174 and 4,762,785 both to Calgene, and U.S. Patents
5,004,863 and 5,159,135 both to Agracetus. Other transformation
technology includes whiskers technology, see U.S. Patents
5,302,523 and 5,464,765 both to Zeneca. Electroporation
technology has also been used to transform plants, see WO
87/06614 to Boyce Thompson Institute, 5,472,869 and 5,384,253
both to Dekalb, WO9209696 and WO9321335 both to PGS. All of
these transformation patents and publications are incorporated by
reference. In addition to numerous technologies for transforming
plants, the type of tissue which is contacted with the foreign
genes may vary as well. Such tissue would include but would not
be limited to embryogenic tissue, callus tissue type I and II,
hypocotyl, meristem, and the like. Almost all plant tissues may
be transformed during dedifferentiation using appropriate
techniques within the skill of an artisan.

Another variable is the choice of a selectable marker. The
preference for a particular marker is at the discretion of the
artisan, but any of the following selectable markers may be used
along with any other gene not listed herein which could function
as a selectable marker. Such selectable markers include but are
not limited to the aminoglycoside phosphotransferase gene of
transposon Tn5 (Aph II), which encodes resistance to the
antibiotics kanamycin, neomycin and G418, as well as those genes
which code for resistance or tolerance to glyphosate; hygromycin;
methotrexate; phosphinothricin (bialaphos); imidazolinones,
sulfonylureas and triazolopyrimidine herbicides, such as
chlorsulfuron; bromoxynil, dalapon and the like.

In addition to a selectable marker, it may be desirable to
use a reporter gene. In some instances a reporter gene may be
used without a selectable marker. Reporter genes are genes which
are typically not present or expressed in the recipient organism
or tissue. The reporter gene typically encodes a protein
which provides for some phenotypic change or enzymatic property.
Examples of such genes are provided in K. Weising et al., Ann.
Rev. Genetics, 22, 421 (1988), which is incorporated herein by
reference. A preferred reporter gene is the glucuronidase (GUS)
gene.

Regardless of transformation technique, the gene is
preferably incorporated into a gene transfer vector adapted to
express the Photorhabdus toxins in the plant cell by including in
the vector a plant promoter. In addition to plant promoters,
promoters from a variety of sources can be used efficiently in
plant cells to express foreign genes. For example, promoters of
bacterial origin, such as the octopine synthase promoter, the
nopaline synthase promoter, the mannopine synthase promoter;
promoters of viral origin, such as the cauliflower mosaic virus
(35S and 19S) and the like may be used. Plant promoters include,
but are not limited to, ribulose-1,5-bisphosphate (RUBP)
carboxylase small subunit (ssu), beta-conglycinin promoter,
phaseolin promoter, ADH promoter, heat-shock promoters and tissue
specific promoters. Promoters may also contain certain enhancer
sequence elements that may improve the transcription efficiency.
Typical enhancers include but are not limited to Adh-intron 1 and
Adh-intron 6. Constitutive promoters may be used. Constitutive
promoters direct continuous gene expression in all cell types
and at all times (e.g., actin, ubiquitin, CaMV 35S). Tissue
specific promoters are responsible for gene expression in
specific cell or tissue types, such as the leaves or seeds (e.g.,
zein, oleosin, napin, AGP) and these promoters may also be used.
Promoters may also be active during a certain stage of the
plant's development as well as active in plant tissues and
organs. Examples of such promoters include but are not limited
to pollen-specific, embryo specific, corn silk specific, cotton
fiber specific, root specific, seed endosperm specific promoters
and the like.

Under certain circumstances it may be desirable to use an
inducible promoter. An inducible promoter is responsible for
expression of genes in response to a specific signal, such as:
physical stimulus (heat shock genes); light (RUBP carboxylase);
hormone (Em); metabolites; and stress. Other desirable
transcription and translation elements that function in plants
may be used. Numerous plant-specific gene transfer vectors are
known to the art.

In addition, it is known that to obtain high expression of
bacterial genes in plants it is preferred to reengineer the
bacterial genes so that they are more efficiently expressed in
the cytoplasm of plants. Maize is one such plant where it is
preferred to reengineer the bacterial gene(s) prior to
transformation to increase the expression level of the toxin in
the plant. One reason for the reengineering is the very low G+C
content of the native bacterial gene(s) (and consequent skewing
towards high A+T content). This results in the generation of
sequences mimicking or duplicating plant gene control sequences
that are known to be highly A+T rich. The presence of some A+T-
rich sequences within the DNA of the gene(s) introduced into
plants (e.g., TATA box regions normally found in gene promoters)
may result in aberrant transcription of the gene(s). On the
other hand, the presence of other regulatory sequences residing
in the transcribed mRNA (e.g., polyadenylation signal sequences
(AAUAAA), or sequences complementary to small nuclear RNAs
involved in pre-mRNA splicing) may lead to RNA instability.
Therefore, one goal in the design of reengineered bacterial
gene(s), more preferably referred to as plant optimized gene(s),
is to generate a DNA sequence having a higher G+C content, and
preferably one close to that of plant genes coding for metabolic
enzymes. Another goal in the design of the plant optimized
gene(s) is to generate a DNA sequence that not only has a higher
G+C content, but in which the sequence changes are made so as
not to hinder translation.
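
As a minimal sketch of the two checks this paragraph motivates, the following Python computes the G+C fraction of a DNA sequence and locates A+T-rich motifs of the kind named above (the TATA box core and the DNA form of the AAUAAA polyadenylation signal). The function names and the example sequence are hypothetical and serve only as an illustration; this is not the procedure used in the Examples.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


def find_motif(seq, motif):
    """Return 0-based start positions of motif in seq, including
    overlapping occurrences."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical A+T-rich coding fragment, for illustration only.
example = "ATGAATAAAGCTTATATAAGGCC"

print(f"G+C fraction: {gc_content(example):.2f}")
# AATAAA is the DNA form of the AAUAAA polyadenylation signal.
print("AATAAA at:", find_motif(example, "AATAAA"))
print("TATA at:", find_motif(example, "TATA"))
```

A sequence flagged by either check would be a candidate for the codon substitutions described in the specification.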

An example of a plant that has a high G+C content is maize.
The table below illustrates how high the G+C content is in maize.
As in maize, it is thought that G+C content in other plants is
also high.



Table 1
Compilation of G+C contents of protein coding regions
of maize genes

Number of genes in class given in parentheses.
Standard deviations given in parentheses.
Combined groups mean ignored in calculation of
overall mean.

For the data in Table 1, coding regions of the genes were
extracted from GenBank (Release 71) entries, and base
compositions were calculated using the MacVector program (IBI,
New Haven, CT). Intron sequences were ignored in the
calculations. Group I and II storage protein gene sequences were
distinguished by their marked difference in base composition.

Due to the plasticity afforded by the redundancy of the
genetic code (i.e., some amino acids are specified by more than
one codon), evolution of the genomes of different organisms or
classes of organisms has resulted in differential usage of
redundant codons. This "codon bias" is reflected in the mean base
composition of protein coding regions. For example, organisms
with relatively low G+C contents utilize codons having A or T in
the third position of redundant codons, whereas those having
higher G+C contents utilize codons having G or C in the third
position. It is thought that the presence of "minor" codons
within a gene's mRNA may reduce the absolute translation rate of
that mRNA, especially when the relative abundance of the charged
tRNA corresponding to the minor codon is low. An extension of
this is that the diminution of translation rate by individual
minor codons would be at least additive for multiple minor
codons. Therefore, mRNAs having high relative contents of minor
codons would have correspondingly low translation rates. This
rate would be reflected by the synthesis of low levels of the
encoded protein.
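
The third-position effect described above can be made concrete with a short sketch. The two example coding sequences below are hypothetical; both encode the same peptide (Met-Lys-Leu-Ala-Glu followed by a stop codon) using different synonymous codons. The set of "minor" codons is likewise an assumed placeholder, since which codons are minor depends on the host plant and is not specified here.

```python
def split_codons(cds):
    """Split a coding sequence into consecutive 3-base codons."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def third_position_gc(cds):
    """Fraction of codons whose third base is G or C."""
    codons = split_codons(cds)
    return sum(1 for c in codons if c[2] in "GC") / len(codons)


def minor_codon_fraction(cds, minor_codons):
    """Fraction of codons that belong to the given minor-codon set."""
    codons = split_codons(cds)
    return sum(1 for c in codons if c in minor_codons) / len(codons)


# Two hypothetical sequences encoding the same peptide.
at_biased = "ATGAAATTAGCTGAATAA"   # A or T favored in third position
gc_biased = "ATGAAGCTCGCCGAGTGA"   # G or C favored in third position

# Assumed placeholder set of codons that are minor in the host plant.
hypothetical_minor = {"AAA", "TTA", "GCT", "GAA"}

for name, cds in (("A/T-biased", at_biased), ("G/C-biased", gc_biased)):
    print(name,
          f"third-position G+C = {third_position_gc(cds):.2f},",
          f"minor codons = {minor_codon_fraction(cds, hypothetical_minor):.2f}")
```

Under the additive assumption stated in the text, the sequence with the higher minor-codon fraction would be expected to translate more slowly in that host.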

In order to reengineer the bacterial gene(s), the codon bias
of the plant is determined. The codon bias is the statistical
codon distribution that the plant uses for coding its proteins.
After determining the bias, the percent frequency of the codons
in the gene(s) of interest is determined. The primary codons
preferred by the plant should be determined, as well as the second
and third choices of preferred codons. The amino acid sequence of
the protein of interest is reverse translated so that the
resulting nucleic acid sequence codes for the same protein as the
native bacterial gene, but the resulting nucleic acid sequence
corresponds to the first preferred codons of the desired plant.
The new sequence is analyzed for restriction enzyme sites that
might have been created by the modification. The identified
sites are further modified by replacing the codons with second or
third choice preferred codons. Other sites in the sequence which
could affect the transcription or translation of the gene of
interest are the exon:intron 5' or 3' junctions, poly A addition
signals, or RNA polymerase termination signals. The sequence is
further analyzed and modified to reduce the frequency of TA or
doublets. In addition to the doublets, G or C sequence blocks
that have more than about four residues that are the same can
affect transcription of the sequence. Therefore, these blocks
are also modified by replacing the codons of first or second
choice, etc. with the next preferred codon of choice. It is
preferred that the plant optimized gene(s) contains about 63% of
first choice codons, between about 22% and about 37% second choice
codons, and between 15% and 0% third choice codons, wherein the
total percentage is 100%. Most preferably, the plant optimized
gene(s) contains about 63% of first choice codons, at least about
22% second choice codons, about 7.5% third choice codons, and
about 7.5% fourth choice codons, wherein the total percentage is
100%. The method described above enables one skilled in the art
to modify gene(s) that are foreign to a particular plant so that
the genes are optimally expressed in plants. The method is
further illustrated in pending provisional application U.S.
60/005,405 filed on October 13, 1995, which is incorporated
herein by reference.

Thus, in order to design plant optimized gene(s), the amino
acid sequence of the toxins is reverse translated into a DNA
sequence, utilizing a nonredundant genetic code established from
a codon bias table compiled for the gene DNA sequences of the
particular plant being transformed. The resulting DNA sequence,
which is completely homogeneous in codon usage, is further
modified to establish a DNA sequence that, besides having a
higher degree of codon diversity, also contains strategically
placed restriction enzyme recognition sites, desirable base
composition, and a lack of sequences that might interfere with
transcription of the gene, or translation of the product mRNA.
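
The reverse translation procedure described above can be sketched
in Python as follows. This is a simplified illustration under
stated assumptions, not the method of the referenced provisional
application: the ranked codon table PREFERRED_CODONS is a
hypothetical placeholder rather than a measured plant codon bias
table, the example peptide is invented, and only one kind of
unwanted feature (a single restriction enzyme recognition site) is
handled. The sketch first uses first choice codons throughout and
then demotes an overlapping codon to its next choice until the
unwanted site no longer occurs.

```python
# Hypothetical ranked codon preferences, first choice first. Illustrative
# placeholders only; a real design would use a table compiled for the plant.
PREFERRED_CODONS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "I": ["ATC", "ATT"],
    "L": ["CTG", "CTC", "TTG"],
    "K": ["AAG", "AAA"],
    "G": ["GGC", "GGG", "GGT"],
}
CODON_TO_AA = {c: aa for aa, codons in PREFERRED_CODONS.items() for c in codons}


def reverse_translate(protein, ranks=None):
    """Build a DNA sequence using, per residue, the codon of the given rank."""
    ranks = ranks or [0] * len(protein)
    return "".join(PREFERRED_CODONS[aa][r] for aa, r in zip(protein, ranks))


def translate(dna):
    """Translate DNA back to protein using the codons in the table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def remove_site(protein, site):
    """Start from first choice codons, then demote codons overlapping `site`."""
    ranks = [0] * len(protein)
    dna = reverse_translate(protein, ranks)
    pos = dna.find(site)
    while pos != -1:
        first, last = pos // 3, (pos + len(site) - 1) // 3
        for i in range(first, last + 1):
            if ranks[i] + 1 < len(PREFERRED_CODONS[protein[i]]):
                ranks[i] += 1  # fall back to the next preferred codon
                break
        else:
            raise ValueError("site cannot be removed with the available codons")
        dna = reverse_translate(protein, ranks)
        pos = dna.find(site)
    return dna, ranks


peptide = "MWILKG"            # invented example peptide
bamhi_site = "GGATCC"         # BamHI recognition sequence
optimized, ranks = remove_site(peptide, bamhi_site)
print(reverse_translate(peptide))  # first choice codons only
print(optimized, ranks)
print("first choice fraction:", ranks.count(0) / len(ranks))
```

In this sketch the first choice sequence contains a BamHI site
spanning the Trp, Ile and Leu codons, and substituting the second
choice Ile codon removes it without changing the encoded peptide.
The same loop could be extended to other unwanted motifs, or to
tracking the proportions of first, second and third choice codons.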






It is theorized that bacterial genes may be more easily
expressed in plants if the bacterial genes are expressed in the
plastids. Thus, it may be possible to express bacterial genes in
plants, without optimizing the genes for plant expression, and
obtain high expression of the protein. See U.S. Patent Nos.
4,762,785; 5,451,513 and 5,545,817, which are incorporated herein
by reference.

One of the issues regarding commercially exploiting transgenic
plants is resistance management. This is of particular concern
with Bacillus thuringiensis toxins. There are numerous companies
commercially exploiting Bacillus thuringiensis, and there has been
much concern about insects becoming resistant to Bt toxins. One
strategy for insect resistance management would be to combine the
toxins produced by Photorhabdus with toxins such as Bt, vegetative
insect proteins (Ciba Geigy) or other toxins. The combinations
could be formulated for a sprayable application or could be
molecular combinations. Plants could be transformed with
Photorhabdus genes that produce insect toxins and with other
insect toxin genes such as Bt.

European Patent Application 0400246A1 describes
transformation of 2 Bt genes in a plant, which could be any 2 genes.
Another way to produce a transgenic plant that contains more than
one insect resistance gene would be to produce two plants, with
each plant containing an insect resistance gene. These plants
would be backcrossed using traditional plant breeding techniques
to produce a plant containing more than one insect resistance
gene.

In addition to producing a transformed plant containing
plant optimized gene(s), there are other delivery systems where
it may be desirable to reengineer the bacterial gene(s). Along
the same lines, a genetically engineered, easily isolated protein
toxin fusing together both a molecule attractive to insects as a
food source and the insecticidal activity of the toxin may be
engineered and expressed in bacteria or in eukaryotic cells using
standard, well-known techniques. After purification in the
laboratory, such a toxic agent with "built-in" bait could be
packaged inside standard insect trap housings.

Another delivery scheme is the incorporation of the genetic
material of toxins into a baculovirus vector. Baculoviruses
infect particular insect hosts, including those desirably
targeted with the Photorhabdus toxins. Infectious baculovirus
harboring an expression construct for the Photorhabdus toxins
could be introduced into areas of insect infestation to thereby
intoxicate or poison infected insects.

Transfer of the insecticidal properties requires nucleic
acid sequences encoding the amino acid sequences for
the Photorhabdus toxins integrated into a protein expression
vector appropriate to the host in which the vector will reside.
One way to obtain a nucleic acid sequence encoding a protein with
insecticidal properties is to isolate the native genetic material
which produces the toxins from Photorhabdus, using information
deduced from the toxin's amino acid sequence, large portions of
which are set forth below. As described below, methods of
purifying the proteins responsible for toxin activity are also
disclosed.

Using N-terminal amino acid sequence data, such as set forth
below, one can construct oligonucleotides complementary to all,
or a section of, the DNA bases that encode the first amino acids
of the toxin. These oligonucleotides can be radiolabeled and
used as molecular probes to isolate the genetic material from a
genomic genetic library built from genetic material isolated from
strains of Photorhabdus. The genetic library can be cloned in
plasmid, cosmid, phage or phagemid vectors. The library could be
transformed into Escherichia coli and screened for toxin
production by the transformed cells using antibodies raised
against the toxin or direct assays for insect toxicity.

This approach requires the production of a battery of
oligonucleotides, since the degenerate genetic code allows an
amino acid to be encoded in the DNA by any of several
three-nucleotide combinations. For example, the amino acid arginine
can be encoded by nucleic acid triplets CGA, CGC, CGG, CGT, AGA,
and AGG. Since one cannot predict which triplet is used at those
positions in the toxin gene, one must prepare oligonucleotides
with each potential triplet represented. More than one DNA
molecule corresponding to a protein subunit may be necessary to
construct a sufficient number of oligonucleotide probes to
recover all of the protein subunits necessary to achieve oral
toxicity.
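
The size of the oligonucleotide battery implied by the degenerate
genetic code can be estimated with a short Python sketch. The
standard genetic code is built from its usual compact
representation; the function names and the example peptide
"FRLSMWKD" are hypothetical and are not taken from the toxin
sequence data. The sketch counts how many distinct oligonucleotides
are needed to cover every codon combination for a peptide, and
picks the stretch of a given length that requires the fewest.

```python
from itertools import product
from math import prod

# Standard genetic code in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(codon)


def probe_degeneracy(peptide):
    """Number of distinct oligonucleotides needed to cover all codon choices."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def least_degenerate_window(peptide, length):
    """Return the stretch of `length` residues needing the fewest probes."""
    if not 0 < length <= len(peptide):
        raise ValueError("window length must be between 1 and the peptide length")
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=probe_degeneracy)


print(sorted(CODONS_FOR["R"]))       # the six arginine triplets
example = "FRLSMWKD"                 # invented N-terminal sequence
print(probe_degeneracy(example))     # probes needed for the whole stretch
best = least_degenerate_window(example, 4)
print(best, probe_degeneracy(best))  # least degenerate 4-residue stretch
```

Stretches rich in methionine and tryptophan, which each have a
single codon, keep the battery small, whereas arginine, leucine and
serine each multiply the number of probes by six.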

From the amino acid sequence of the purified protein,
genetic materials responsible for the production of toxins can
readily be isolated and cloned, in whole or in part, into an
expression vector using any of several techniques well-known to
one skilled in the art of molecular biology. A typical
expression vector is a DNA plasmid, though other transfer means
including, but not limited to, cosmids, phagemids and phage are
also envisioned. In addition to features required or desired for
plasmid replication, such as an origin of replication and
antibiotic resistance or other form of a selectable marker such
as the bar gene of Streptomyces hygroscopicus or
viridochromogenes, protein expression vectors normally
additionally require an expression cassette which incorporates
the cis-acting sequences necessary for transcription and
translation of the gene of interest. The cis-acting sequences
required for expression in prokaryotes differ from those required
in eukaryotes and plants.

A eukaryotic expression cassette requires a transcriptional
promoter upstream (5') to the gene of interest, a transcriptional
termination region such as a poly-A addition site, and a ribosome
binding site upstream of the gene of interest's first codon. In
bacterial cells, a useful transcriptional promoter that could be
included in the vector is the T7 RNA Polymerase-binding promoter.
Promoters, as previously described herein, are known to
efficiently promote transcription of mRNA. Also upstream from
the gene of interest the vector may include a nucleotide sequence
encoding a signal sequence known to direct a covalently linked
protein to a particular compartment of the host cells such as the
cell surface.

Insect viruses, or baculoviruses, are known to infect and
adversely affect certain insects. The effect of the viruses on
insects is slow, and viruses do not stop the feeding of insects.
Thus viruses are not viewed as being useful as insect pest
control agents. Combining the Photorhabdus toxin genes into a
baculovirus vector could provide an efficient way of transmitting
the toxins while increasing the lethality of the virus. In
addition, since different baculoviruses are specific to different
insects, it may be possible to use a particular toxin to
selectively target particularly damaging insect pests. A
particularly useful vector for the toxin genes is the nuclear
polyhedrosis virus. Transfer vectors using this virus have been
described and are now the vectors of choice for transferring
foreign genes into insects. The virus-toxin gene recombinant may
be constructed in an orally transmissible form. Baculoviruses
normally infect insect victims through the mid-gut intestinal
mucosa. The toxin gene inserted behind a strong viral coat
protein promoter would be expressed and should rapidly kill the
infected insect.

In addition to an insect virus or baculovirus or transgenic
plant delivery system for the protein toxins of the present
invention, the proteins may be encapsulated using Bacillus
thuringiensis encapsulation technology such as but not limited to
U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595 which are all
incorporated herein by reference. Another delivery system for
the protein toxins of the present invention is formulation of the
protein into a bait matrix, which could then be used in above and
below ground insect bait stations. Examples of such technology
include but are not limited to PCT Patent Application WO
93/23998, which is incorporated herein by reference.

As is described above, it might become necessary to modify
the sequence encoding the protein when expressing it in a
non-native host, since the codon preferences of other hosts may
differ from that of Photorhabdus. In such a case, translation
may be quite inefficient in a new host unless compensating
modifications to the coding sequence are made. Additionally,
modifications to the amino acid sequence might be desirable to
avoid inhibitory cross-reactivity with proteins of the new host,
or to refine the insecticidal properties of the protein in the
new host. A genetically modified toxin gene might encode a toxin
exhibiting, for example, enhanced or reduced toxicity, altered
insect resistance development, altered stability, or modified
target species specificity.

In addition to the Photorhabdus genes encoding the toxins,
the scope of the present invention is intended to include related
nucleic acid sequences which encode amino acid biopolymers
homologous to the toxin proteins and which retain the toxic
effect of the Photorhabdus proteins in insect species after oral
ingestion.

For instance, the toxins used in the present invention seem
to first inhibit larval feeding before death ensues. By
manipulating the nucleic acid sequence of a Photorhabdus toxin
gene or its controlling sequences, genetic engineers placing the
toxin gene into plants could modulate its potency or its mode of
action to, for example, keep the eating-inhibitory activity while
eliminating the absolute toxicity to the larvae. This change
could permit the transformed plant to survive until harvest
without having the unnecessarily dramatic effect on the ecosystem
of wiping out all target insects. All such modifications of the
gene encoding the toxin, or of the protein encoded by the gene,
are envisioned to fall within the scope of the present invention.
Other envisioned modifications of the nucleic acid include
the addition of targeting sequences to direct the toxin to
particular parts of the insect larvae for improving its
efficiency.

Strains ATCC 55397, 43948, 43949, 43950, 43951, 43952 have
been deposited in the American Type Culture Collection, 12301
Parklawn Drive, Rockville, MD 20852 USA. Amino acid and
nucleotide sequence data for the W-14 native toxin (ATCC 55397)
are presented below. Isolation of the genomic DNA for the toxins
from the bacterial hosts is also exemplified herein.
Standard molecular biology techniques were followed and are
taught in the specification herein. Additional information may
be found in Sambrook, J., Fritsch, E. F., and Maniatis, T.
(1989), Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press, which is incorporated herein by reference.

The following abbreviations are used throughout the Examples:
Tris = tris(hydroxymethyl)aminomethane; SDS = sodium dodecyl
sulfate; EDTA = ethylenediaminetetraacetic acid; IPTG =
isopropylthio-β-galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-
D-galactoside; CTAB = cetyltrimethylammonium bromide; kbp =
kilobase pairs; dATP, dCTP, dGTP, dTTP, I = 2'-deoxynucleoside
5'-triphosphates of adenine, cytosine, guanine, thymine, and
inosine, respectively; ATP = adenosine 5' triphosphate.

Example 1

Purification of toxin from P. luminescens and Demonstration of
toxicity after oral delivery of purified toxin

The insecticidal protein toxin of the present invention was
purified from P. luminescens strain W-14, ATCC Accession Number
55397. Stock cultures of P. luminescens were maintained on petri
dishes containing 2% Proteose Peptone No. 3 (i.e., PP3, Difco
Laboratories, Detroit MI) in 1.5% agar, incubated at 25 °C and
transferred weekly. Colonies of the primary form of the bacteria
were inoculated into 200 ml of PP3 broth supplemented with 0.5%
polyoxyethylene sorbitan monostearate (Tween 60, Sigma Chemical
Company, St. Louis MO) in a one liter flask. The broth cultures
were grown for 72 hours at 20 °C on a rotary shaker. The toxin
proteins can be recovered from cultures grown in the presence or
absence of Tween; however, the absence of Tween can affect the
form of the bacteria grown and the profile of proteins produced
by the bacteria. In the absence of Tween, a variant shift occurs
insofar as the molecular weight of at least one identified toxin
subunit shifts from about 200 kDa to about 185 kDa.
The 72 hour cultures were centrifuged at 10,000 x g for 30
minutes to remove cells and debris. The supernatant fraction
that contained the insecticidal activity was decanted and brought
to 50 mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4.
The pH was adjusted to 8.6 by adding potassium hydroxide. This
supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia
LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4.
The toxic activity was adsorbed to the DEAE resin. This mixture
was then poured into a 2.6 x 40 cm column and washed with 50 mM
K2HPO4 at room temperature at a flow rate of 30 ml/hr until the
effluent reached a steady baseline UV absorbance at 280 nm. The
column was then washed with 150 mM KCl until the effluent again
reached a steady 280 nm baseline. Finally the column was washed
with 300 mM KCl and fractions were collected.
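
The "appropriate volume" of 1.0 M K2HPO4 stock mentioned above can
be worked out with a short calculation, sketched here in Python.
Two assumptions are made that are not stated in the text: the
supernatant is treated as containing negligible phosphate before
the addition, and the supernatant volume of 200 ml is taken from
the culture volume purely for illustration (the recovered volume
after centrifugation would be somewhat less). Because the added
stock also increases the total volume, the volume to add is
V = V0 x Cf / (Cs - Cf), where V0 is the supernatant volume, Cs the
stock concentration and Cf the final concentration.

```python
def stock_volume_to_add(sample_volume_ml, stock_molar, target_molar):
    """Volume of stock (ml) that brings the mixture to the target molarity.

    Assumes the sample initially contains none of the solute and that
    volumes are additive.
    """
    if not 0 < target_molar < stock_molar:
        raise ValueError("target must be positive and below the stock concentration")
    return sample_volume_ml * target_molar / (stock_molar - target_molar)


supernatant_ml = 200.0  # illustrative volume, assumed equal to the culture volume
volume = stock_volume_to_add(supernatant_ml, stock_molar=1.0, target_molar=0.050)
final_molar = 1.0 * volume / (supernatant_ml + volume)  # check of the result
print(round(volume, 2), "ml of 1.0 M stock")
print(round(final_molar, 3), "M final concentration")
```

For these illustrative values the stock volume is one nineteenth of
the supernatant volume, or about 10.5 ml.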

Fractions containing the toxin were pooled and filter
sterilized using a 0.2 micron pore membrane filter. The toxin
was then concentrated and equilibrated to 100 mM KPO4, pH 6.9,
using an ultrafiltration membrane with a molecular weight cutoff
of 100 kDa at 4 °C (Centriprep 100, Amicon Division, W.R. Grace and
Company). A 3 ml sample of the toxin concentrate was applied to
the top of a 2.6 x 95 cm Sephacryl S-400 HR gel filtration column
(Pharmacia LKB Biotechnology). The eluent buffer was 100 mM KPO4,
pH 6.9, which was run at a flow rate of 17 ml/hr, at 4 °C. The
effluent was monitored at 280 nm.

Fractions were collected and tested for toxic activity.
Toxicity of chromatographic fractions was examined in a
biological assay using Manduca sexta larvae. Fractions were
either applied directly onto the insect diet (Gypsy moth wheat
germ diet, ICN Biochemicals Division - ICN Biomedicals, Inc.) or
administered by intrahemocelic injection of a 5 µl sample through
the first proleg of 4th or 5th instar larvae using a 30 gauge
needle. The weight of each larva within a treatment group was
recorded at 24 hour intervals. Toxicity was presumed if the
insect ceased feeding and died within several days of consuming
treated insect diet or if death occurred within 24 hours after
injection of a fraction.

The toxic fractions were pooled and concentrated using the
Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x 60
cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium
phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This
analysis revealed the toxin protein to be contained within a
single sharp peak that eluted from the column with a retention
time of approximately 33.6 minutes. This retention time
corresponded to an estimated molecular weight of 1,000 kDa. Peak
fractions were collected for further purification while fractions
not containing this protein were discarded. The peak eluted from
the HPLC absorbed UV light at 218 and 280 nm but did not absorb at
405 nm. Absorbance at 405 nm was shown to be an attribute of
xenorhabdin antibiotic compounds.

Electrophoresis of the pooled peak fractions in a
non-denaturing agarose gel (Metaphor Agarose, FMC BioProducts) showed
that two protein complexes were present in the peak. The peak
material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on a
1.5% agarose stacking gel buffered with 100 mM Tris-HCl at pH 7.0
and 1.9% agarose resolving gel buffered with 200 mM Tris-borate
at pH 8.3 under standard buffer conditions (anode buffer 1 M
Tris-HCl, pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The
gels were run at 13 mA constant current at 15 °C until the phenol
red tracking dye reached the end of the gel. Two protein bands
were visualized in the agarose gels using Coomassie brilliant
blue staining.

The slower migrating band was referred to as "protein band
1" and the faster migrating band was referred to as "protein band 2."
The two protein bands were present in approximately equal
amounts. The Coomassie stained agarose gels were used as a guide
to precisely excise the two protein bands from unstained portions
of the gels. The excised pieces containing the protein bands
were macerated and a small amount of sterile water was added. As
a control, a portion of the gel that contained no protein was
also excised and treated in the same manner as the gel pieces
containing the protein. Protein was recovered from the gel
pieces by electroelution into 100 mM Tris-borate pH 8.3, at 100
volts (constant voltage) for two hours. Alternatively, protein
was passively eluted from the gel pieces by adding an equal
volume of 50 mM Tris-HCl, pH 7.0, to the gel pieces, then
incubating at 30 °C for 16 hours. This allowed the protein to
diffuse from the gel into the buffer, which was then collected.
Results of insect toxicity tests using HPLC-purified toxin
(33.6 min. peak) and agarose gel purified toxin demonstrated
toxicity of the extracts. Injection of 1.5 µg of the HPLC
purified protein killed within 24 hours. Both protein bands 1 and
2, recovered from agarose gels by passive elution or
electroelution, were lethal upon injection. The protein
concentration estimated for these samples was less than 50
ng/larva. A comparison of the weight gain and the mortality
between the groups of larvae injected with protein bands 1 and 2
indicated that protein band 1 was more toxic by injection
delivery.

When HPLC-purified toxin was applied to larval diet at a
concentration of 7.5 µg/larva, it caused a halt in larval weight
gain (24 larvae tested). The larvae began to feed, but after
consuming only a very small portion of the toxin treated diet
they began to show pathological symptoms induced by the toxin and
ceased feeding. The insect frass became discolored and
most larvae showed signs of diarrhea. Significant insect
mortality resulted when several 5 µg toxin doses were applied to
the diet over a 7-10 day period.

Agarose-separated protein band 1 significantly inhibited
larval weight gain at a dose of 200 ng/larva. Larvae fed similar
concentrations of protein band 2 were not inhibited and gained
weight at the same rate as the control larvae. Twelve larvae
were fed eluted protein and 45 larvae were fed protein-containing
agarose pieces. These two sets of data indicate that protein
band 1 was orally toxic to Manduca sexta. In this experiment, it
appeared that protein band 2 was not toxic to Manduca sexta.

Further analysis of protein bands 1 and 2 by SDS-PAGE under
denaturing conditions showed that each band was composed of
several smaller protein subunits. Proteins were visualized by
Coomassie brilliant blue staining followed by silver staining to
achieve maximum sensitivity.

The protein subunits in the two bands were very similar.
Protein band 1 contained 8 protein subunits of 25.1, 56.2, 60.8,
65.6, 166, 171, 184 and 208 kDa. Protein band 2 had an identical
profile except that the 25.1, 60.8, and 65.6 kDa proteins were
not present. The 56.2, 60.8, 65.6, and 184 kDa proteins were
present in the complex of protein band 1 at approximately equal
concentrations and represented 80% or more of the total protein
content of that complex.

The native HPLC-purified toxin was further characterized as
follows. The toxin was heat labile in that after being heated to
60 °C for 15 minutes it lost its ability to kill or to inhibit
weight gain when injected or fed to M. sexta larvae. Assays were
designed to detect lipase, type C phospholipase, nuclease or red
blood cell hemolysis activities and were performed with purified
toxin. None of these activities were present. Antibiotic zone
inhibition assays were also done and the purified toxin failed to
inhibit growth of Gram-negative or -positive bacteria, yeast or
filamentous fungi, indicating that the toxin is not a xenorhabdin
antibiotic.

The native HPLC-purified toxin was tested for ability to
kill insects other than Manduca sexta. Table 2 lists insects
killed by the HPLC-purified P. luminescens toxin in this study.



Table 2

Insects Killed by P. luminescens Toxin

Common Name        Order         Genus and species      Route of Delivery

Tobacco hornworm   Lepidoptera   Manduca sexta          Oral and injected
Mealworm           Coleoptera    Tenebrio molitor       Oral
Pharaoh ant        Hymenoptera   Monomorium pharaonis   Oral
German cockroach   Dictyoptera   Blattella germanica    Oral and injected
Mosquito           Diptera       Aedes aegypti          Oral

The Photorhabdus luminescens utility and toxicity were
further characterized. Photorhabdus luminescens (strain W-14)
culture broth was produced as follows. The production medium was
2% Bacto Proteose Peptone Number 3 (PP3, Difco Laboratories,
Detroit, Michigan) in Milli-Q deionized water. Seed culture
flasks consisted of 175 ml medium placed in a 500 ml tribaffled
flask with a Delong neck, covered with a Kaput and autoclaved
for 20 minutes, T = 250 °F. Production flasks consisted of 500 ml
in a 2.8 liter tribaffled flask with a Delong neck,
covered by a Shin-etsu silicon foam closure. These were
autoclaved for 45 minutes, T = 250 °F. The seed culture was
incubated at 28 °C at 150 rpm in a gyrotory shaking incubator with
a 2 inch throw. After 16 hours of growth, 1% of the seed culture
was placed in the production flask, which was allowed to grow for
24 hours before harvest. Production of the toxin appears to occur
during log phase growth. The microbial broth was transferred to
a 1 L centrifuge bottle and the cellular biomass was pelleted (30
minutes at 2500 RPM at 4 °C, R.C.F. = ~1600; HG-4L Rotor, RC3
Sorval centrifuge, Dupont, Wilmington, Delaware). The primary
broth was chilled at 4 °C for 8-16 hours and recentrifuged for at
least 2 hours (conditions above) to further clarify the broth by
removal of a putative mucopolysaccharide which precipitated upon
standing. (An alternative processing method combined both steps
and involved the use of a 16 hour clarification centrifugation,
same conditions as above.) This broth was then stored at 4 °C
prior to bioassay or filtration.
Photorhabdus culture broth and protein toxin(s) purified
from this broth showed activity (mortality and/or growth
inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm
(larvae and adult), Colorado potato beetle, and turf grubs, which
are members of the insect order Coleoptera. Other members of the
Coleoptera include wireworms, pollen beetles, flea beetles, seed
beetles and weevils. Activity has also been observed against
aster leafhopper, which is a member of the order Homoptera.
Other members of the Homoptera include planthoppers, pear psylla,
apple sucker, scale insects, whiteflies, and spittle bugs, as
well as numerous host specific aphid species. The broth and
purified fractions are also active against beet armyworm, cabbage
looper, black cutworm, tobacco budworm, European corn borer, corn
earworm, and codling moth, which are members of the order
Lepidoptera. Other typical members of this order are clothes
moth, Indian mealmoth, leaf rollers, cabbage worm, cotton
bollworm, bagworm, Eastern tent caterpillar, sod webworm, and
fall armyworm. Activity is also seen against fruitfly and
mosquito larvae, which are members of the order Diptera. Other
members of the order Diptera are pea midge, carrot fly, cabbage
root fly, turnip root fly, onion fly, crane fly, house fly, and
various mosquito species. Activity is seen against carpenter ant
and Argentine ant, which are members of the order that also
includes fire ants, odorous house ants, and little black ants.

The broth and fractions are useful for reducing populations of
insects and were used in a method of inhibiting an insect
population. The method may comprise applying to a locus of the
insect an effective insect inactivating amount of the active
described. Results are reported in Table 3.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth (filter sterilized, cell-free) or
purified HPLC fractions were applied directly to the surface
(~1.5 cm²) of 0.25 ml of artificial diet in 30 µl aliquots
following dilution in control medium or 10 mM sodium phosphate
buffer, pH 7.0, respectively. The diet plates were allowed to
air-dry in a sterile flow-hood and the wells were infested with
single, neonate Diabrotica undecimpunctata howardi (Southern corn
rootworm, SCR) hatched from sterilized eggs, with second instar
SCR grown on artificial diet or with second instar Diabrotica
virgifera virgifera (Western corn rootworm, WCR) reared on corn
seedlings grown in Metromix. Second instar larvae were weighed
prior to addition to the diet. The plates were sealed, placed in
a humidified growth chamber and maintained at 21^Q for the 
appropriate period U days for neonate and adult SCR, 2-5 days 
for WCR larvae, 7-14 days for second instar SCR). Mortality and
weight determinations were scored as indicated. Generally, 16
insects per treatment were used in all studies. Control
mortalities were as follows: neonate larvae, <5%; adult beetles,
5%.




Activity against Colorado potato beetle was tested as
follows. Photorhabdus culture broth or control medium was applied
to the surface (~2.0 cm²) of 1.5 ml of standard artificial diet
held in the wells of a 24-well tissue culture plate. Each well
received 50 µl of treatment and was allowed to air dry.
Individual second instar Colorado potato beetle (Leptinotarsa
decemlineata, CPB) larvae were then placed onto the diet and
mortality was scored after 4 days. Ten larvae per treatment were
used in all studies. Control mortality was 3.3%.

Activity against Japanese beetle grubs and beetles was
tested as follows. Turf grubs (Popillia japonica, 2nd-3rd instar)
were collected from infested lawns and maintained in the
laboratory in soil/peat mixture with carrot slices added as
additional diet. Turf beetles were pheromone-trapped locally and
maintained in the laboratory in plastic containers with maple
leaves as food. Following application of undiluted Photorhabdus
culture broth or control medium to corn rootworm artificial diet
(30 µl/1.54 cm², beetles) or carrot slices (larvae), both stages
were placed singly in a diet well and observed for any mortality
and feeding. In both cases there was a clear reduction in the
amount of feeding (and feces production) observed.

Activity against mosquito larvae was tested as follows. The
assay was conducted in a 96-well microtiter plate. Each well
contained 200 µl of aqueous solution (Photorhabdus culture broth,
control medium or H2O) and approximately 20 1-day old larvae
(Aedes aegypti). There were 6 wells per treatment. The results
were read at 2 hours after infestation and did not change over
the three day observation period. No control mortality was seen.

Activity against fruitflies was tested as follows.
Purchased Drosophila melanogaster medium was prepared using 50%
dry medium and a 50% liquid of either water, control medium or
Photorhabdus culture broth. This was accomplished by placing
8.0 ml of dry medium in each of 3 rearing vials per treatment and
adding 8.0 ml of the appropriate liquid. Ten late instar
Drosophila melanogaster maggots were then added to each vial.
The vials were held on a laboratory bench, at room temperature,
under fluorescent ceiling lights. Pupal or adult counts were
made after 3, 7 and 10 days of exposure. Incorporation of
Photorhabdus culture broth into the diet media for fruitfly
maggots caused a slight (17%) but significant reduction in day-10
adult emergence as compared to water and control medium (3%
reduction).

Activity against aster leafhopper was tested as follows. 
The ingestion assay for aster leafhopper (Macrosteles severini) 
is designed to allow ingestion of the active without other 
external contact. The reservoir for the active/"food" solution 
is made by making 2 holes in the center of the bottom portion of 
a 35 x 10 mm Petri dish. A 2 inch Parafilm M® square is placed 
across the top of the dish and secured with an "O" ring. A 1 oz. 
plastic cup is then infested with approximately 7 leafhoppers and 
the reservoir is placed on top of the cup, Parafilm down. The 
test solution is then added to the reservoir through the holes. 
In tests using undiluted Photorhabdus culture broth, the broth 
and control medium were dialyzed against water to reduce control 
mortality. Mortality is reported at day 2 where 26.5% control 
mortality was seen. In the tests using purified fractions (200 
mg protein/ml) a final concentration of 5% sucrose was used in 
all treatments to improve survivability of the aster leafhoppers. 
The assay was held in an incubator at 28°C, 70% RH with a 16/8 
photoperiod. The assay was graded for mortality at 72 hours. 
Control mortality was 5.5%. 

Activity against Argentine ants was tested as follows. A 
1.5 ml aliquot of 100% Photorhabdus culture broth, control medium 
or water was pipetted into 2.0 ml clear glass vials. The vials 
were plugged with a piece of cotton dental wick that was 
moistened with the appropriate treatment. Each vial was placed 
into a separate 60 x 16 mm Petri dish with 8 to 12 adult Argentine 
ants (Linepithema humile). There were three replicates per 
treatment. Bioassay plates were held on a laboratory bench, at 
room temperature under fluorescent ceiling lights. Mortality 
readings were made after 5 days of exposure. Control mortality 
was 24%. 

Activity against carpenter ant was tested as follows. Black 
carpenter ant workers (Camponotus pennsylvanicus) were collected 
from trees on DowElanco property in Indianapolis, IN. Tests with 
Photorhabdus culture broth were performed as follows. Each 
plastic bioassay container (7 1/8" x 3") held fifteen workers, a 
paper harborage and 10 ml of broth or control media in a plastic 
shot glass. A cotton wick delivered the treatment to the ants 
through a hole in the shot glass lid. All treatments contained 
5% sucrose. Bioassays were held in the dark at room temperature 
and graded at 19 days. Control mortality was 9%. Assays 
delivering purified fractions utilized artificial ant diet mixed 
with the treatment (purified fraction or control solution) at a 
rate of 0.2 ml treatment/2.0 g diet in a plastic test tube. The 
final protein concentration of the purified fraction was less 
than 10 µg/g diet. Ten ants per treatment, a water source, 
harborage and the treated diet were placed in sealed plastic 
containers and maintained in the dark at 27°C in a humidified 
incubator. Mortality was scored at day 10. No control mortality 
was seen. 

Activity against various lepidopteran larvae was tested as 
follows. Photorhabdus culture broth or purified fractions were 
applied directly to the surface (~1.5 cm²) of 0.25 ml of standard 
artificial diet in 30 µl aliquots following dilution in control 
medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. 
The diet plates were allowed to air-dry in a sterile flow-hood 
and the wells were infested with a single neonate larva. European 
corn borer (Ostrinia nubilalis) and corn earworm (Helicoverpa 
zea) eggs were supplied from commercial sources and hatched 
in-house, whereas beet armyworm (Spodoptera exigua), cabbage looper 
(Trichoplusia ni), tobacco budworm (Heliothis virescens), codling 
moth (Laspeyresia pomonella) and black cutworm (Agrotis ipsilon) 
larvae were supplied internally. Following infestation with 
larvae, the diet plates were sealed, placed in a humidified 
growth chamber and maintained in the dark at 27°C for the 
appropriate period. Mortality and weight determinations were 
scored at days 5-7 for Photorhabdus culture broth and days 4-7 
for the purified fraction. Generally, 16 insects per treatment 
were used in all studies. Control mortality ranged from 4-12.5% 
for control medium and was less than 10% for phosphate buffer. 
Table 3 

Effect of Photorhabdus luminescens (strain W-14) 
Culture Broth and Purified Toxin Fraction on Mortality and Growth 
Inhibition of Different Insect Orders/Species 

                                        Broth             Purified Fraction 
Insect Order/Species               % Mort.   % G.I.      % Mort.       % G.I. 

COLEOPTERA 
Corn Rootworm 
  Southern/neonate larva           100       na          100           na 
  Southern/2nd instar              na        38.5        nt            [illegible] 
  Southern/adult                   45        nt          nt            nt 
  Western/2nd instar               na        35          nt            nt 
Colorado Potato Beetle/ 
  2nd instar                       93        nt          nt            nt 
Turf Grub 
  3rd instar                       [illegible] a.f.      [illegible]   nt 
  adult                            na        a.f.        nt            nt 

DIPTERA 
Fruit Fly (adult emergence)        17        nt          nt            nt 
Mosquito larvae                    100       na          nt            nt 

HOMOPTERA 
Aster Leafhopper                   96.5      na          100           na 

HYMENOPTERA 
Argentine Ant                      75        na          nt            na 
Carpenter Ant                      71        na          100           na 

LEPIDOPTERA 
Beet Armyworm                      12.5      36          18.75         41.4 
Black Cutworm                      nt        nt          [illegible]   71.2 
Cabbage Looper                     nt        nt          21.9          66.8 
Codling Moth                       nt        nt          6.25          45.9 
Corn Earworm                       56.3      94.2        97.9          na 
European Corn Borer                96.7      98.4        100           na 
Tobacco Budworm                    13.5      52.5        19.4          85.6 

Mort. = mortality, G.I. = growth inhibition, 
na = not applicable, nt = not tested, a.f. = anti-feedant 
Example 3 

Insecticide Utility Upon Soil Application 

Photorhabdus luminescens (strain W-14) culture broth was 
shown to be active against corn rootworm when applied directly to 
soil or a soil-mix (Metromix®). Activity against neonate SCR and 
WCR in Metromix® was tested as follows (Table 4). The test was 
run using corn seedlings (United Agriseeds brand CL614) that were 
germinated in the light on moist filter paper for 6 days. After 
roots were approximately 3-6 cm long, a single kernel/seedling 
was planted in a 591 ml clear plastic cup with 50 gm of dry 
Metromix®. Twenty neonate SCR or WCR were then placed directly on 
the roots of the seedling and covered with Metromix®. Upon 
infestation, the seedlings were then drenched with 50 ml total 
volume of a diluted broth solution. After drenching, the cups 
were sealed and left at room temperature in the light for 7 days. 
Afterwards, the seedlings were washed to remove all Metromix® and 
the roots were excised and weighed. Activity was rated as the 
percentage of corn root remaining relative to the control plants 
and as leaf damage induced by feeding. Leaf damage was scored 
visually and rated as either -, +, ++, or +++, with - 
representing no damage and +++ representing severe damage. 

Activity against neonate SCR in soil was tested as follows 
(Table 5). The test was run using corn seedlings (United 
Agriseeds brand CL614) that were germinated in the light on moist 
filter paper for 6 days. After the roots were approximately 3-6 
cm long, a single kernel/seedling was planted in a 591 ml clear 
plastic cup with 150 gm of soil from a field in Lebanon, IN 
planted the previous year with corn. This soil had not been 
previously treated with insecticides. Twenty neonate SCR were 
then placed directly on the roots of the seedling and covered 
with soil. After infestation, the seedlings were drenched with 
50 ml total volume of a diluted broth solution. After drenching, 
the unsealed cups were incubated in a high relative humidity 
chamber (80%) at 78°F. Afterwards, the seedlings were washed to 
remove all soil and the roots were excised and weighed. Activity 
was rated as the percentage of corn root remaining relative to 
the control plants and as leaf damage induced by feeding. Leaf 
damage was scored visually and rated as either -, +, ++, or +++, 
with - representing no damage and +++ representing severe damage. 
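
The root-weight rating described above can be illustrated with a short calculation. The sketch below is not part of the original protocol; it assumes that the control for each infested treatment is the matching uninfested treatment, and it uses southern corn rootworm root weights as reported in Table 4. The function and variable names are illustrative only.

```python
def percent_of_control(treated_g: float, control_g: float) -> float:
    """Return treated root weight as a percentage of the control root weight."""
    if control_g <= 0:
        raise ValueError("control root weight must be positive")
    return 100.0 * treated_g / control_g


# Mean root weights (g) for southern corn rootworm, taken from Table 4.
water_uninfested = 0.4916
water_infested = 0.1410
broth_uninfested = 0.4641   # broth at 6.25% v/v, no larvae
broth_infested = 0.4830     # broth at 1.56% v/v, with larvae

water_pct = percent_of_control(water_infested, water_uninfested)
broth_pct = percent_of_control(broth_infested, broth_uninfested)

print(f"Water + larvae: {water_pct:.1f}% of control root weight")
print(f"Broth + larvae: {broth_pct:.1f}% of control root weight")
```

Under this assumption the water-drenched, infested seedlings retain roughly 29% of control root weight, whereas the broth-drenched, infested seedlings retain about 104%, which agrees with the 28 and 104 listed in Table 4 to within rounding.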
Table 4 

Effect of Photorhabdus luminescens (strain W-14) Culture 
Broth on Rootworm Larvae After Post-Infestation Drenching 
(Metromix®) 

Treatment            Larvae   Leaf Damage    Root Weight (g)    % of Control 

Southern Corn Rootworm 
Water                  -      [illegible]    0.4916 ± 0.023     100 
Medium (2.0% v/v)      -      [illegible]    0.4416 ± 0.029     100 
Broth (6.25% v/v)      -      [illegible]    0.4641 ± 0.081     100 

Water                  +      [illegible]    0.1410 ± 0.006     28 
Media (2.0% v/v)       +      [illegible]    0.1345 ± 0.028     30 
Broth (1.56% v/v)      +      [illegible]    0.4830 ± 0.031     104 

Western Corn Rootworm 
Water                  -      [illegible]    0.4446 ± 0.019     100 
Broth (2.0% v/v)       -      [illegible]    0.4069 ± 0.026     100 

Water                  +      [illegible]    0.2202 ± 0.015     49 
Broth (2.0% v/v)       +      [illegible]    0.3879 ± 0.013     95 

Table 5 

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on 
Southern Corn Rootworm Larvae After Post-Infestation Drenching 
(Soil) 

Treatment            Larvae   Leaf Damage    Root Weight (g)    % of Control 

Water                  -      [illegible]    0.2148 ± 0.014     100 
Broth (50% v/v)        -      [illegible]    0.2260 ± 0.016     103 

Water                  +      [illegible]    0.0916 ± 0.009     43 
Broth (50% v/v)        +      [illegible]    0.2428 ± 0.032     113 

Activity of Photorhabdus luminescens (strain W-14) culture 
broth against second instar turf grubs in Metromix® was observed 
in tests conducted as follows (Table 6). Approximately 50 gm of 
dry Metromix® was added to a 591 ml clear plastic cup. The 
Metromix® was then drenched with 50 ml total volume of a 50% (v/v) 
diluted Photorhabdus broth solution. The dilution of crude broth 
was made with water, with 50% broth being prepared by adding 25 
ml of crude broth to 25 ml of water for 50 ml total volume. A 1% 
(w/v) solution of proteose peptone #3 (PP3), which is a 50% 
dilution of the normal media concentration, was used as a broth 
control. After drenching, five second instar turf grubs were 
placed on the top of the moistened Metromix®. Healthy turf grub 
larvae burrowed rapidly into the Metromix®. Those larvae that did 
not burrow within 1 h were removed and replaced with fresh larvae. 
The cups were sealed and placed in a 28°C incubator, in the dark. 
After seven days, larvae were removed from the Metromix® and 
scored for mortality. Activity was rated as the percentage of 
mortality relative to control. 
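
The dilution and mortality arithmetic used in this assay can be sketched as follows. This is an illustrative calculation only; it assumes that the second number in each Table 6 ratio is the total number of larvae scored, since the listed percentages match that reading. The function names are hypothetical.

```python
def percent_v_v(stock_ml: float, diluent_ml: float) -> float:
    """Percent (v/v) of stock in the final mixture."""
    total = stock_ml + diluent_ml
    if total <= 0:
        raise ValueError("total volume must be positive")
    return 100.0 * stock_ml / total


def mortality_percent(dead: int, scored: int) -> int:
    """Mortality as a whole-number percentage of the larvae scored."""
    if scored <= 0:
        raise ValueError("number of larvae scored must be positive")
    return round(100 * dead / scored)


# 25 ml of crude broth added to 25 ml of water, as described in the text.
broth_pct = percent_v_v(25.0, 25.0)

# (dead, scored) counts as listed in Table 6.
table6_counts = {
    "Water": (7, 15),
    "Control medium (1.0% w/v)": (12, 19),
    "Broth (50% v/v)": (17, 20),
}
table6_percent = {
    name: mortality_percent(dead, scored)
    for name, (dead, scored) in table6_counts.items()
}

print(broth_pct)
print(table6_percent)
```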



Table 6 

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on 
Turf Grub After Pre-Infestation Drenching (Metromix®) 

Treatment                     Mortality*    Mortality % 

Water                         7/15          47 
Control medium (1.0% w/v)     12/19         63 
Broth (50% v/v)               17/20         85 

*expressed as a ratio of dead/living larvae 

Example 4 

Insecticide Utility Upon Leaf Application 



Activity of Photorhabdus broth against European corn borer 
was seen when the broth was applied directly to the surface of 
maize leaves (Table 7). In these assays Photorhabdus broth was 
diluted 100-fold with culture medium and applied manually to the 
surface of excised maize leaves at a rate of ~6.0 µl/cm² of leaf 
surface. The leaves were air dried and cut into equal sized 
strips approximately 2 x 2 inches. The leaves were rolled, 
secured with paper clips and placed in 1 oz plastic shot glasses 
with 0.25 inch of 2% agar on the bottom surface to provide 
moisture. Twelve neonate European corn borers were then placed 
onto the rolled leaf and the cup was sealed. After incubation 
for 5 days at 27°C in the dark, the samples were scored for 
feeding damage and recovered larvae. 
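
As a rough illustration of the application rate described above, the following sketch converts the 100-fold dilution to a percent (v/v) and estimates the volume of diluted broth carried by one 2 x 2 inch strip at about 6.0 µl/cm². It assumes that only one leaf surface is treated and that the rate applies uniformly; the function names are illustrative and not taken from the original procedure.

```python
CM_PER_INCH = 2.54


def fold_dilution_to_percent(fold: float) -> float:
    """Convert an n-fold dilution to percent (v/v) of the original broth."""
    if fold <= 0:
        raise ValueError("fold dilution must be positive")
    return 100.0 / fold


def applied_volume_ul(rate_ul_per_cm2: float, width_in: float, height_in: float) -> float:
    """Volume (µl) applied to one rectangular leaf surface at the given rate."""
    area_cm2 = (width_in * CM_PER_INCH) * (height_in * CM_PER_INCH)
    return rate_ul_per_cm2 * area_cm2


broth_percent = fold_dilution_to_percent(100)      # 1.0% v/v, as listed in Table 7
strip_volume_ul = applied_volume_ul(6.0, 2.0, 2.0)  # about 155 µl per strip

print(f"{broth_percent:.1f}% v/v, {strip_volume_ul:.0f} µl per strip")
```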
Table 7 

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on 
European Corn Borer Larvae Following Pre-Infestation Application 
to Excised Maize Leaves 

Treatment           Leaf Damage    Larvae Recovered    Weight (mg) 

Water               Extensive      55/120              0.42 mg 
Control Medium      Extensive      40/120              0.50 mg 
Broth (1.0% v/v)    Trace          3/120               0.15 mg 


Activity of the culture broth against neonate tobacco 
budworm (Heliothis virescens) was demonstrated using a leaf dip 
methodology. Fresh cotton leaves were excised from the plant and 
leaf disks were cut with an 18.5 mm cork-borer. The disks were 
individually immersed in control medium (PP3) or Photorhabdus 
luminescens (strain W-14) culture broth which had been 
concentrated approximately 10-fold using an Amicon (Beverly, MA) 
Proflux M12 tangential filtration system with a 10 kDa filter. 
Excess liquid was removed and a straightened paper clip was 
placed through the center of the disk. The paper clip was then 
wedged into a plastic, 1.0 oz shot glass containing approximately 
2.0 ml of 1% agar. This served to suspend the leaf disk above 
the agar. Following drying of the leaf disk, a single neonate 
tobacco budworm larva was placed on the disk and the cup was 
capped. The cups were then sealed in a plastic bag and placed in 
a darkened, 21°C incubator for 5 days. At this time the 
remaining larvae and leaf material were weighed to establish a 
measure of leaf damage (Table 8). 

Table 8 

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on 
Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay 

                          Final Weights (mg) 
Treatment               Leaf Disk      Larvae 

Control leaves          55.7 ± 1.3     na* 
Control Medium          34.0 ± 2.9     4.3 ± 0.91 
Photorhabdus broth      54.3 ± 1.4     0.0** 

* - not applicable, ** - no live larvae found 

Example 5, Part A 
Characterization of Toxin Peptide Components 

In a subsequent analysis, the toxin protein subunits of the 
bands isolated as in Example 1 were resolved on a 7% SDS 
polyacrylamide electrophoresis gel with a ratio of 30:0.8 
(acrylamide:BIS-acrylamide). This gel matrix facilitates better 
resolution of the larger proteins. The gel system used to 
estimate the Band 1 and Band 2 subunit molecular weights in 
Example 1 was an 18% gel with a ratio of 38:0.18 
(acrylamide:BIS-acrylamide), which allowed for a broader range of 
size separation, but less resolution of higher molecular weight 
components. 

In this analysis, 10, rather than 8, protein bands were 
resolved. Table 9 reports the calculated molecular weights of 
the 10 resolved bands, and directly compares the molecular 
weights estimated under these conditions to those of the prior 
example. It is not surprising that additional bands were 
detected under the different separation conditions used in this 
example. Variations between the prior and new estimates of 
molecular weight are also to be expected given the differences in 
analytical conditions. In the analysis of this example, it is 
thought that the higher molecular weight estimates are more 
accurate than in Example 1, as a result of improved resolution. 
However, these are estimates based on SDS PAGE analysis, which 
is typically not analytically precise, and the apparent sizes of 
the peptides may have been further altered due to post- and 
co-translational modifications. 
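
For readers unfamiliar with how molecular weights are estimated from SDS PAGE, the following sketch shows one common approach: fitting log10(molecular weight) of marker proteins against their relative mobility (Rf) and interpolating the unknown band. The marker mobilities and sizes below are hypothetical placeholders, not data from this example, and the original analysis may have used a different fitting procedure.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(rf, markers):
    """Estimate molecular weight (kDa) of a band from its relative mobility.

    markers is a list of (Rf, kDa) pairs for proteins of known size run on
    the same gel.
    """
    xs = [marker_rf for marker_rf, _ in markers]
    ys = [math.log10(kda) for _, kda in markers]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (slope * rf + intercept)


# Hypothetical marker ladder (Rf, kDa); values are for illustration only.
hypothetical_markers = [
    (0.10, 200.0),
    (0.30, 116.0),
    (0.45, 97.0),
    (0.65, 66.0),
    (0.85, 45.0),
]

estimate = estimate_mw_kda(0.20, hypothetical_markers)
print(f"Estimated size at Rf 0.20: {estimate:.0f} kDa")
```

Because the fit is made on a logarithmic scale, small errors in measured mobility translate into proportionally larger errors for high molecular weight bands, which is one reason a lower-percentage gel gives better estimates for the larger peptides.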

Amino acid sequences were determined for the N-terminal 
portions of five of the 10 resolved peptides. Table 9 correlates 
the molecular weight of the proteins and the identified 
sequences. In SEQ ID NO:2, certain analyses suggest that the 
proline at residue 5 may be an asparagine (asn). In SEQ ID NO:3, 
certain analyses suggest that the amino acid residues at 
positions 13 and 14 are both arginine (arg). In SEQ ID NO:4, 
certain analyses suggest that the amino acid residue at position 
6 may be either alanine (ala) or serine (ser). In SEQ ID NO:5, 
certain analyses suggest that the amino acid residue at position 
3 may be aspartic acid (asp). 

Table 9 

EXAMPLE 1 ESTIMATE    NEW ESTIMATE*    SEQ. LISTING 

208                   200.2 kDa        SEQ ID NO:1 
184                   175.0 kDa        SEQ ID NO:2 
65.6                  68.1 kDa         SEQ ID NO:3 
60.8                  65.1 kDa         SEQ ID NO:4 
56.2                  58.3 kDa         SEQ ID NO:5 
25.1                  23.2 kDa         SEQ ID NO:15 

*New estimates are based on SDS PAGE and are not based on 
gene sequences. SDS PAGE is not analytically precise. 


Example 5, Part B 
Characterization of Toxin Peptide Components 

New N-terminal sequence, SEQ ID NO:15, Ala Gln Asp Gly Asn 
Gln Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further 
N-terminal sequencing of peptides isolated from Native 
HPLC-purified toxin as described in Example 5, Part A, above. This 
peptide comes from the tcaA gene. The peptide, labeled TcaAii, 
starts at position 254 and goes to position 491, where the 
TcaAiii peptide starts, SEQ ID NO:4. The estimated size of the 
peptide based on the gene sequence is 25,240 Da. 
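
As a plausibility check on the size quoted above, a peptide mass can be approximated from its residue count. The sketch below is not the calculation used in this example; it assumes an average residue mass of about 110 Da (a common rule of thumb) and treats positions 254 through 491 as inclusive, both of which are assumptions. The exact value depends on the actual amino acid composition.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def approx_peptide_mass_da(first_position: int, last_position: int) -> float:
    """Approximate mass of a peptide spanning the given inclusive positions."""
    if last_position < first_position:
        raise ValueError("last position must not precede first position")
    residue_count = last_position - first_position + 1
    return residue_count * AVERAGE_RESIDUE_MASS_DA


approx_mass = approx_peptide_mass_da(254, 491)
reported_mass = 25240.0  # size given in the text, based on the gene sequence
relative_difference = (approx_mass - reported_mass) / reported_mass

print(f"approx. {approx_mass:.0f} Da vs reported {reported_mass:.0f} Da "
      f"({relative_difference:+.1%})")
```

The rough estimate falls within a few percent of the gene-derived value and is also close to the 23.2 to 25.1 kDa range estimated by SDS PAGE in Table 9.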

Example 6 

Characterization of Toxin Peptide Components 

In yet another analysis, the toxin protein complex was 
re-isolated from the Photorhabdus luminescens growth medium (after 
culture without Tween) by performing a 10% - 80% ammonium sulfate 
precipitation followed by an ion exchange chromatography step 
(Mono Q) and two molecular sizing chromatography steps. These 
conditions were like those used in Example 1. During the first 
molecular sizing step, a second biologically active peak was 
found at about 100 ± 10 kDa. Based upon protein measurements, 
this fraction was 20 - 50 fold less active than the larger, or 
primary, active peak of about 860 ± 100 kDa (native). During 
this isolation experiment, a smaller active peak of about 325 ± 
50 kDa that retained a considerable portion of the starting 
biological activity was also resolved. It is thought that the 
325 kDa peak is related to or derived from the 860 kDa peak. 
A 56 kDa protein was resolved in this analysis. The N-terminal 
sequence of this protein is presented in SEQ ID NO:6. 
It is noteworthy that this protein shares significant identity 
and conservation with SEQ ID NO:5 at the N-terminus, suggesting 
that the two may be encoded by separate members of a gene family 
and that the proteins produced by each gene are sufficiently 
similar to both be operable in the insecticidal toxin complex. 

A second, prominent 185 kDa protein was consistently present 
in amounts comparable to that of protein 3 from Table 9, and may 
be the same protein or protein fragment. The N-terminal sequence 
of this 185 kDa protein is shown at SEQ ID NO:7. 

Additional N-terminal amino acid sequence data were also 
obtained from isolated proteins. None of the determined 
N-terminal sequences appear identical to a protein identified in 
Table 9. Other proteins were present in the isolated preparation. 
One such protein has an estimated molecular weight of 108 kDa and 
an N-terminal sequence as shown in SEQ ID NO:8. A second such 
protein has an estimated molecular weight of 80 kDa and an 
N-terminal sequence as shown in SEQ ID NO:9. 

When the protein material in the approximately 325 kDa 
active peak was analyzed by size, bands of approximately 51, 31, 
28, and 22 kDa were observed. As in all cases in which a 
molecular weight was determined by analysis of electrophoretic 
mobility, these molecular weights were subject to error effects 
introduced by buffer ionic strength differences, electrophoresis 
power differences, and the like. One of ordinary skill would 
understand that definitive molecular weight values cannot be 
determined using these standard methods and that each was subject 
to variation. It was hypothesized that proteins of these sizes 
are degradation products of the larger protein species (of 
approximately 200 kDa size) that were observed in the larger 
primary toxin complex. 

Finally, several preparations included a protein having the 
N-terminal sequence shown in SEQ ID NO:10. This sequence was 
strongly homologous to known chaperonin proteins, accessory 
proteins known to function in the assembly of large protein 
complexes. Although the applicants could not ascribe such an 
assembly function to the protein identified in SEQ ID NO:10, it 
was consistent with the existence of the described toxin protein 
complex that such a chaperonin protein could be involved in its 
assembly. Moreover, although such proteins have not directly 
been suggested to have toxic activity, this protein may be 
important to determining the overall structural nature of the 
protein toxin, and thus, may contribute to the toxic activity or 
durability of the complex in vivo after oral delivery. 

Subsequent analysis of the stability of the protein toxin 
complex to proteinase K was undertaken. It was determined that 
after 24 hour incubation of the complex in the presence of a 
10-fold molar excess of proteinase K, activity was virtually 
eliminated (mortality on oral application dropped to about 5%). 
These data confirm the proteinaceous nature of the toxin. 

The toxic activity was also retained by a dialysis membrane, 
again confirming the large size of the native toxin complex. 

Example 7 

Isolation, Characterization and Partial Amino Acid 
Sequencing of Photorhabdus Toxins 

Isolation and N-Terminal Amino Acid Sequencing: In a set of 
experiments conducted in parallel to Examples 5 and 6, ammonium 
sulfate precipitation of Photorhabdus proteins was performed by 
adjusting Photorhabdus broth, typically 2-3 liters, to a final 
concentration of either 10% or 20% by the slow addition of 
ammonium sulfate crystals. After stirring for 1 hour at 4°C, the 
material was centrifuged at 12,000 x g for 30 minutes. The 
supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C 
for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The 
pellet was resuspended in one-tenth the volume of 10 mM Na2·PO4, 
pH 7.0 and dialyzed against the same phosphate buffer overnight 
at 4°C. The dialyzed material was centrifuged at 12,000 x g for 
1 hour prior to ion exchange chromatography. 

A HR 16/50 Q Sepharose (Pharmacia) anion exchange column was 
equilibrated with 10 mM Na2·PO4, pH 7.0. Centrifuged, dialyzed 
ammonium sulfate pellet was applied to the Q Sepharose column at 
a rate of 1.5 ml/min and washed extensively at 3.0 ml/min with 
equilibration buffer until the optical density (O.D. 280) reached 
less than 0.100. Next, either a 60 minute NaCl gradient ranging 
from 0 to 0.5 M at 3 ml/min, or a series of step elutions using 
0.1 M, 0.4 M and finally 1.0 M NaCl for 60 minutes each was applied 
to the column. Fractions were pooled and concentrated using a 
Centriprep 100. Alternatively, proteins could be eluted by a 
single 0.4 M NaCl wash without prior elution with 0.1 M NaCl. 
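
The gradient elution described above can be summarized numerically. The sketch below assumes the 0 to 0.5 M NaCl gradient is linear in time, which the text does not state explicitly, and uses illustrative function names; it gives the nominal NaCl concentration at a given time and the total buffer volume delivered.

```python
def gradient_concentration_m(
    t_min: float,
    start_m: float = 0.0,
    end_m: float = 0.5,
    duration_min: float = 60.0,
) -> float:
    """Nominal NaCl concentration (M) at time t for a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient")
    return start_m + (end_m - start_m) * t_min / duration_min


def delivered_volume_ml(duration_min: float = 60.0, flow_ml_per_min: float = 3.0) -> float:
    """Total buffer volume (ml) pumped over the gradient."""
    return duration_min * flow_ml_per_min


midpoint_m = gradient_concentration_m(30.0)
gradient_volume = delivered_volume_ml()

print(f"{midpoint_m} M NaCl at 30 min; {gradient_volume} ml total")
```

Under the linearity assumption, the 0.4 M step used in the alternative procedure corresponds to the composition reached 48 minutes into the gradient.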

Two milliliter aliquots of concentrated Q Sepharose samples 
were loaded at 0.5 ml/min onto a HR 16/50 Superose 12 (Pharmacia) 
gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0. 
The column was washed with the same buffer for 240 min at 0.5 
ml/min and 2 min samples were collected. The void volume 
material was collected and concentrated using a Centriprep 100. 
Two milliliter aliquots of concentrated Superose 12 samples were 
loaded at 0.5 ml/min onto a HR 16/50 Sepharose CL-4B (Pharmacia) 
gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0. 
The column was washed with the same buffer for 240 min at 0.5 
ml/min and 2 min samples were collected. 

The excluded protein peak was subjected to a second 
fractionation by application to a gel filtration column that used 
a Sepharose CL-4B resin, which separates proteins ranging from 
~30 kDa to 1000 kDa. This fraction was resolved into two peaks: 
a minor peak at the void volume (>1000 kDa) and a major peak 
which eluted at an apparent molecular weight of about 860 kDa. 
Over a one week period subsequent samples subjected to gel 
filtration showed the gradual appearance of a third peak 
(approximately 325 kDa) that seemed to arise from the major peak, 
perhaps by limited proteolysis. Bioassays performed on the three 
peaks showed that the void peak had no activity, while the 860 
kDa toxin complex fraction was highly active, and the 325 kDa 
peak was less active, although quite potent. SDS PAGE analysis 
of Sepharose CL-4B toxin complex peaks from different 
fermentation productions revealed two distinct peptide patterns, 
denoted "P" and "S". The two patterns had marked differences in 
the molecular weights and concentrations of peptide components in 
their fractions. The "S" pattern, produced most frequently, had 
4 high molecular weight peptides (> 150 kDa) while the "P" 
pattern had 3 high molecular weight peptides. In addition, the 
"S" peptide fraction was found to have 2-3 fold more activity 
against European Corn Borer. This shift may be related to 
variations in protein expression due to age of inoculum and/or 
other factors based on growth parameters of aged cultures. 

Milligram quantities of peak toxin complex fractions 
determined to be "P" or "S" peptide patterns were subjected to 
preparative SDS PAGE, and transblotted with TRIS-glycine 
(Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) 
for 3-4 hours. Blots were sent for amino acid analysis and 
N-terminal amino acid sequencing at Harvard MicroChem and Cambridge 
ProChem, respectively. Three peptides in the "S" pattern had 
unique N-terminal amino acid sequences compared to the sequences 
identified in the previous example. A 201 kDa (TcdAii) peptide 
set forth as SEQ ID NO:13 below shared between 33% amino acid 
identity and 50% similarity with SEQ ID NO:1 (TcbAii) (Table 10; 
in Table 10 vertical lines denote amino acid identities and 
colons indicate conservative amino acid substitutions). A second 
peptide of 197 kDa, SEQ ID NO:14 (TcdB), had 42% identity and 58% 
homology with SEQ ID NO:2 (TcaC). Yet a third peptide of 205 kDa 
was denoted TcdAii. In addition, a limited N-terminal amino acid 
sequence, SEQ ID NO:16 (TcbA), of a peptide of at least 235 kDa 
was identical in homology with the amino acid sequence, SEQ ID 
NO:12, deduced from a cloned gene (tcbA), SEQ ID NO:11, 
containing a deduced amino acid sequence corresponding to SEQ ID 
NO:1 (TcbAii). This indicates that the larger 235+ kDa peptide 
was proteolytically processed to the 201 kDa peptide (TcbAii) 
(SEQ ID NO:1) during fermentation, possibly resulting in 
activation of the molecule. In yet another sequence, the 
sequence originally reported as SEQ ID NO:5 (TcaBii) in 
Example 5 above was found to contain an aspartic acid residue 
(Asp) at the third position rather than glycine (Gly) and two 
additional amino acids, Gly and Asp, at the eighth and ninth 
positions, respectively. In yet two other sequences, SEQ ID NO:2 
(TcaC) and SEQ ID NO:3 (TcaBi), additional amino acid sequence was 
obtained. Densitometric quantitation was performed using a 
sample that was identical to the "S" preparation sent for 
N-terminal analysis. This analysis showed that the 201 kDa and 197 
kDa peptides represent 7.0% and 7.2%, respectively, of the total 
Coomassie brilliant blue stained protein in the "S" pattern and 
are present in amounts similar to the other abundant peptides. 
It is speculated that these peptides may represent protein 
homologs, analogous to the situation found with other bacterial 
toxins, such as various CryI Bt toxins. These proteins vary from 
40-90% homology at their N-terminal amino acid sequence, which 
encompasses the toxic fragment. 
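
The identity and similarity percentages quoted above can be understood through a simple scoring routine for two pre-aligned sequences. The sketch below is illustrative only: the conservative-substitution groups, the exclusion of gapped positions from the denominator, and the two short sequences are all assumptions and are not the sequences or scoring scheme used to produce Table 10. It emits a marker line in the same style as Table 10, with vertical lines for identities and colons for conservative substitutions.

```python
# Illustrative groups of residues treated as conservative substitutions.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def compare_aligned(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (marker_line, percent_identity, percent_similarity).

    Both sequences must already be aligned to the same length; positions
    containing a gap in either sequence are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    markers = []
    identical = similar = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            markers.append(" ")
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
            markers.append("|")
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
            markers.append(":")
        else:
            markers.append(" ")
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return (
        "".join(markers),
        100.0 * identical / compared,
        100.0 * similar / compared,
    )


# Hypothetical aligned sequences, for illustration only.
marker_line, identity, similarity = compare_aligned("MKTAYLDE", "MRTGF-DE")
print("MKTAYLDE")
print(marker_line)
print("MRTGF-DE")
print(f"{identity:.0f}% identity, {similarity:.0f}% similarity")
```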
Internal Amino Acid Sequencing: To facilitate cloning of 
toxin peptide genes, internal amino acid sequences of selected 
peptides were obtained as follows. Milligram quantities of peak 
2A fractions determined to be "P" or "S" peptide patterns were 
subjected to preparative SDS PAGE, and transblotted with 
TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied 
Biosystems) for 3-4 hours. Blots were sent for amino acid 
analysis and N-terminal amino acid sequencing at Harvard 
MicroChem and Cambridge ProChem, respectively. Three peptides, 
referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi 
(containing SEQ ID NO:3) were subjected to trypsin digestion by 
Harvard MicroChem followed by HPLC chromatography to separate 
individual peptides. N-terminal amino acid analysis was 
performed on selected tryptic peptide fragments. Two internal 
peptides were sequenced for the peptide TcdAii (205 kDa peptide) 
referred to as TcdAii-PT111 (SEQ ID NO:17) and TcdAii-PT79 (SEQ ID 
NO:18). Two internal peptides were sequenced for the peptide 
TcaBi (68 kDa peptide) referred to as TcaBi-PT158 (SEQ ID NO:19) 
and TcaBi-PT108 (SEQ ID NO:20). Four internal peptides were 
sequenced for the peptide TcbAii (201 kDa peptide) referred to as 
TcbAii-PT103 (SEQ ID NO:21), TcbAii-PT56 (SEQ ID NO:22), 
TcbAii-PT81(a) (SEQ ID NO:23), and TcbAii-PT81(b) (SEQ ID NO:24). 
Table 10 
N-Terminal Amino Acid Sequences 

201 kDa (33% identity & 50% similarity to SEQ ID NO:1) 

L I G Y N N 2 F S G • A SEQ ID NO:13 
: | | | : | 

F I Q G Y S D L F G N - A SEQ ID NO:1 

197 kDa (42% identity & 58% similarity to SEQ ID NO:2) 
M Q N S Q T F S V G E L SEQ ID NO:14 
SEQ ID NO:2 
Example 8 

[...] Genomic DNA and Its Screening to Isolate Genes Encoding 
Peptides Comprising the Toxic Protein Preparation 

As a prerequisite for the production of Photorhabdus insect 
toxic proteins in heterologous hosts, and for other uses, it is 
necessary to isolate and characterize the genes that encode those 
peptides. This objective was pursued in parallel. One approach, 
described later, was based on the use of monoclonal and 
polyclonal antibodies raised against the purified toxin which 
were then used to isolate clones from an expression library. The 
other approach, described in this example, is based on the use of 
the N-terminal and internal amino acid sequence data to design 
degenerate oligonucleotides for use in PCR amplification. Either 
method can be used to identify DNA clones that contain the 
peptide-encoding genes so as to permit the isolation of the 
respective genes, and the determination of their DNA base 
sequence. 
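
The design of degenerate oligonucleotides from peptide sequence data can be illustrated by counting how many distinct DNA sequences could encode a stretch of amino acids, then choosing the stretch with the lowest count. The sketch below is a generic illustration, not the primer design actually used in this example; it relies on the codon counts of the standard genetic code and applies them to the N-terminal sequence reported as SEQ ID NO:15, written here in one-letter code.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences that could encode the peptide."""
    total = 1
    for residue in peptide:
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide: str, length: int) -> str:
    """Return the first window of the given length with the lowest degeneracy."""
    if length <= 0 or length > len(peptide):
        raise ValueError("window length must be between 1 and the peptide length")
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=degeneracy)


# SEQ ID NO:15 (Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr).
seq_id_15 = "AQDGNQDTFFSGNT"

best_window = least_degenerate_window(seq_id_15, 6)
print(best_window, degeneracy(best_window))
```

Windows rich in residues with one or two codons (for example Met, Trp, Asp, Gln, Phe) keep the oligonucleotide pool small, whereas Ser, Leu, and Arg, with six codons each, increase it sharply.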
GENOMIC DNA ISOLATION: Photorhabdus luminescens strain W-14 
(ATCC accession number 55397) was grown on 2% proteose peptone #3 
agar (Difco Laboratories, Detroit, MI) and insecticidal toxin 
competence was maintained by repeated bioassay after passage, 
using the method described in Example 1 above. A 50 ml shake 
culture was produced in a 175 ml baffled flask in 2% proteose 
peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 
hours. 15 ml of this culture was pelleted and frozen in its 
medium at -20°C until it was thawed for DNA isolation. The 
thawed culture was centrifuged (700 x g, 30 min) and the 
floating orange mucopolysaccharide material was removed. The 
remaining cell material was centrifuged (25,000 x g, 15 min) to 
pellet the bacterial cells, and the medium was removed and 
discarded. 

Genomic DNA was isolated by an adaptation of the CTAB method
described in section 2.4.1 of Current Protocols in Molecular
Biology (Ausubel et al., eds., John Wiley & Sons, 1994) [modified
to include a salt shock and with all volumes increased 10-fold].
The pelleted bacterial cells were resuspended in TE buffer (10 mM
Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12
ml of 5 M NaCl was added; this mixture was centrifuged 20 min at
15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 µl
of 10% SDS and 60 µl of 20 mg/ml proteinase K (Gibco BRL
Products, Grand Island, NY; in sterile distilled water) were
added to the suspension. This mixture was incubated at 37°C for
1 hr; then approximately 10 mg lysozyme (Worthington Biochemical
Corp., Freehold, NJ) was added. After an additional 45 min, 1 ml
of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M
NaCl) were added. This preparation was incubated 10 min at 63°C,
then gently agitated and further incubated and agitated for
approximately 20 min to assist clearing of the cellular material.
An equal volume of chloroform/isoamyl alcohol solution (24:1,
v/v) was added, mixed gently and centrifuged. After two
extractions with an equal volume of PCI
(phenol/chloroform/isoamyl alcohol; 50:49:1, v/v/v; equilibrated
with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation,
Kaysville, UT), the DNA was precipitated with 0.6 volume of
isopropanol. The DNA precipitate was gently removed with a glass
rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml
STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). This
preparation contained 2.5 mg/ml DNA, as determined by optical
density at 260 nm (i.e., OD260).
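
The concentration figure above rests on a spectrophotometric conversion that the text does not spell out. The following is a minimal sketch, assuming the conventional factor of 50 µg/ml per A260 unit for double-stranded DNA at a 1 cm path length. The factor, the dilution, and the example reading are assumptions for illustration and are not taken from the procedure itself.

```python
# Conventional conversion for double-stranded DNA (assumed; not stated in the text).
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_mg_per_ml(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration (mg/ml) from the A260 of a diluted sample."""
    ug_per_ml = a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0


def total_yield_mg(conc_mg_per_ml, volume_ml):
    """Total DNA mass (mg) in a preparation of the given volume."""
    return conc_mg_per_ml * volume_ml


# Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution.
example_conc = dsdna_conc_mg_per_ml(0.5, dilution_factor=100)
# Final volume of about 2 ml, as read from the text.
example_yield = total_yield_mg(example_conc, 2.0)
```

Under these assumed inputs the stock works out to 2.5 mg/ml, which would correspond to roughly 5 mg of DNA in the final volume.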

The molecular size range of the isolated genomic DNA was
evaluated for suitability for library construction. CHEF gel
analysis was performed in 1.5% agarose (SeaKem® LE, FMC
BioProducts, Rockland, ME) gels with 0.5X TBE buffer (44.5 mM
Tris-HCl pH 8.0, 44.5 mM H3BO3, 1 mM EDTA) on a Bio-Rad CHEF-DR II
apparatus with a Pulsewave 760 Switcher (Bio-Rad Laboratories,
Inc., Richmond, CA). The running parameters were: initial A
time, 3 sec; final A time, 12 sec; 200 volts; running
temperature, 4-18°C; run time, 16.5 hr. Ethidium bromide
staining and examination of the gel under ultraviolet light
indicated the DNA ranged from 30-250 kbp in size.

CONSTRUCTION OF LIBRARY: A partial Sau3A I digest was made
of this Photorhabdus genomic DNA preparation. The method was
based on section 3.1.3 of Ausubel (supra). Adaptations included
running smaller scale reactions under various conditions until
nearly optimal results were achieved. Several scaled-up large
reactions with varied conditions were run, the results analyzed
on CHEF gels, and only the best large scale preparation was
carried forward. In the optimal case, 200 µg of Photorhabdus
genomic DNA was incubated with 1.5 units of Sau3A I (New England
Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 ml total
volume of 1X NEB 4 buffer (supplied as 10X by the manufacturer).
The reaction was stopped by adding 2 ml of PCI and centrifuging
at 8000 x g for 10 min. To the supernatant were added 200 µl of
5 M NaCl plus 6 ml of ice-cold ethanol. This preparation was
chilled for 30 min at -20°C, then centrifuged at 12,000 x g for
15 min. The supernatant was removed and the precipitate was
dried in a vacuum oven at 40°C, then resuspended in 400 µl STE.
Spectrophotometric assay indicated about 40% recovery of the
input DNA. The digested DNA was size fractionated on a sucrose
gradient according to section 5.3.2 of CPMB (op. cit.). A 10%
to 40% (w/v) linear sucrose gradient was prepared with a gradient
maker in Ultra-Clear™ tubes (Beckman Instruments, Inc., Palo
Alto, CA) and the DNA sample was layered on top. After
centrifugation (26,000 rpm, 17 hr, Beckman SW41 rotor, 20°C),
fractions (about 750 µl) were drawn from the top of the gradient
and analyzed by CHEF gel electrophoresis (as described earlier).
Fractions containing Sau3A I fragments in the size range 20-40
kbp were selected and DNA was precipitated by a modification
(amounts of all solutions increased approximately 6.3-fold) of
the method in section 5.3.3 of Ausubel (supra). After overnight
precipitation, the DNA was collected by centrifugation (17,000 x
g, 15 min), dried, redissolved in TE, pooled into a final volume
of 80 µl, and reprecipitated with the addition of 8 µl of 3 M
sodium acetate and 220 µl ethanol. The pellet collected by
centrifugation as above was resuspended in 12 µl TE.
Concentration of the DNA was determined by Hoechst 33258 dye
(Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer
TKO100 fluorimeter (Hoefer Scientific Instruments, San Francisco,
CA). Approximately 2.5 µg of the size-fractionated DNA was
recovered.

Thirty µg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was
digested to completion with 100 units of restriction enzyme BamH
I (NEB) in the manufacturer's buffer (final volume of 200 µl,
37°C, 1 hr). The reaction was extracted with 100 µl of PCI and
DNA was precipitated from the aqueous phase by addition of 20 µl
3 M sodium acetate and 550 µl -20°C absolute ethanol. After 10
min at -70°C, the DNA was collected by centrifugation (17,000 x
g, 15 min), dried under vacuum, and dissolved in 180 µl of 10 mM
Tris-HCl, pH 8.0. To this were added 20 µl of 10X CIP buffer
(100 mM Tris-HCl, pH 8.3; 10 mM ZnCl2; 10 mM MgCl2), and 1 µl
(0.25 units) of 1:4 diluted calf intestinal alkaline phosphatase
(Boehringer Mannheim Corporation, Indianapolis, IN). After 30
min at 37°C, the following additions were made: 2 µl 0.5 M EDTA,
pH 8.0; 10 µl 10% SDS; 0.5 µl of 20 mg/ml proteinase K (as
above), followed by incubation at 55°C for 30 min. Following
sequential extractions with 100 µl of PCI and 100 µl phenol
(Intermountain Scientific Corporation, equilibrated with 1 M
Tris-HCl, pH 8.0), the dephosphorylated DNA was precipitated by
addition of 72 µl of 7.5 M ammonium acetate and 550 µl -20°C
ethanol, incubation on ice for 30 min, and centrifugation as
above. The pelleted DNA was washed once with 500 µl -20°C 70%
ethanol, dried under vacuum, and dissolved in 20 µl of TE buffer.

Ligation of the size-fractionated Sau3A I fragments to the
BamH I-digested and phosphatased pWE15 vector was accomplished
using T4 ligase (NEB) by a modification (i.e., use of premixed
10X ligation buffer supplied by the manufacturer) of the protocol
in section 3.33 of Ausubel. Ligation was carried out overnight
in a total volume of 20 µl at 15°C, followed by storage at
-20°C.

Four µl of the cosmid DNA ligation reaction, containing
about 1 µg of DNA, was packaged into bacteriophage lambda using a
commercial packaging extract (Gigapack® III Gold Packaging
Extract, Stratagene), following the manufacturer's directions.
The packaged preparation was stored at 4°C until use. The
packaged cosmid preparation was used to infect Escherichia coli
XL1 Blue MR cells (Stratagene) according to the Gigapack® III Gold
protocols ("Titering the Cosmid Library"), as follows. XL1 Blue
MR cells were grown in LB medium (g/l: Bacto-tryptone, 10;
Bacto-yeast extract, 5; Bacto-agar, 15; NaCl, 5; [Difco Laboratories,
Detroit, MI]) containing 0.2% (w/v) maltose plus 10 mM MgSO4, at
37°C. After 5 hr growth, cells were pelleted at 700 x g (15 min)
and resuspended in 6 ml of 10 mM MgSO4. The culture density was
adjusted with 10 mM MgSO4 to OD600 = 0.5. The packaged cosmid
library was diluted 1:10 or 1:20 with sterile SM medium (0.1 M
NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01% w/v gelatin), and
25 µl of the diluted preparation was mixed with 25 µl of the
diluted XL1 Blue MR cells. The mixture was incubated at 25°C for
30 min (without shaking), then 200 µl of LB broth was added, and
incubation was continued for approximately 1 hr with occasional
gentle shaking. Aliquots (20-40 µl) of this culture were spread
on LB agar plates containing 100 mg/l ampicillin (i.e., LB-Amp100)
and incubated overnight at 37°C. To store the library without
amplification, single colonies were picked and inoculated into
individual wells of sterile 96-well microwell plates, each well
containing 75 µl of Terrific Broth (TB media: 12 g/l
Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v glycerol, 17 mM
KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin (i.e., TB-Amp100), and
incubated (without shaking) overnight at 37°C. After replicating
the 96-well plate into a copy plate, 75 µl/well of
filter-sterilized TB:glycerol (1:1, v/v; with, or without, 100 mg/l
ampicillin) was added to the plate, it was shaken briefly at 100
rpm, 37°C, and then closed with Parafilm® (American National Can,
Greenwich, CT) and placed in a -70°C freezer for storage. Copy
plates were grown and processed identically to the master plates.
A total of 40 such master plates (and their copies) were
prepared.

SCREENING OF THE LIBRARY WITH RADIOLABELED DNA PROBES: To
prepare colony filters for probing with radioactively labeled
probes, ten 96-well plates of the library were thawed at 25°C
(bench top at room temperature). A replica plating tool with 96
prongs was used to inoculate a fresh 96-well copy plate
containing 75 µl/well of TB-Amp100. The copy plate was grown
overnight (stationary) at 37°C, then shaken about 30 min at 100
rpm at 37°C. A total of 800 colonies was represented in these
copy plates, due to nongrowth of some isolates. The replica tool
was used to inoculate duplicate impressions of the 96-well arrays
onto Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron,
220 x 250 mm) which had been placed on solid LB-Amp100 (100
ml/dish) in Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm;
Curtin Matheson Scientific, Inc., Wood Dale, IL). The colonies
were grown on the membranes at 37°C for about 3 hr.
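
Whether a screen of about 800 clones is likely to contain a given gene can be estimated with the standard Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size. The sketch below is illustrative only: the genome size is a hypothetical placeholder (the text gives no genome size), and only the 20-40 kbp insert range and the 800-clone count come from the description above.

```python
import math


def prob_represented(n_clones, insert_kbp, genome_kbp):
    """Probability that a given locus is present in a library of n_clones."""
    f = insert_kbp / genome_kbp
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p, insert_kbp, genome_kbp):
    """Smallest library size giving probability p of containing a given locus."""
    f = insert_kbp / genome_kbp
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


# Hypothetical genome size, assumed for illustration only.
ASSUMED_GENOME_KBP = 5500.0

# Bracket the estimate with the two ends of the selected 20-40 kbp size range.
p_low = prob_represented(800, 20.0, ASSUMED_GENOME_KBP)
p_high = prob_represented(800, 40.0, ASSUMED_GENOME_KBP)
n_for_99 = clones_needed(0.99, 30.0, ASSUMED_GENOME_KBP)
```

With a different assumed genome size the probabilities shift accordingly, so the values should be read as an order-of-magnitude check rather than a property of the library described here.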

A positive control colony (a bacterial clone containing a
GZ4 sequence insert, see below) was grown on a separate Magna NT
membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium
supplemented with 35 mg/l chloramphenicol (i.e., LB-Cam35), and
processed alongside the library colony membranes. Bacterial
colonies on the membranes were lysed, and the DNA was denatured
and neutralized according to a protocol taken from the Genius™
System User's Guide version 2.0 (Boehringer Mannheim,
Indianapolis, IN). Membranes were placed colony side up on
filter paper soaked with 0.5 N NaOH plus 1.5 M NaCl for 15 min to
denature, and neutralized on filter paper soaked with 1 M
Tris-HCl pH 8.0, 1.5 M NaCl for 15 min. After UV-crosslinking using a
Stratagene UV Stratalinker set on auto crosslink, the membranes
were stored dry at 25°C until use. Membranes were trimmed into
strips containing the duplicate impressions of a single 96-well
plate, then washed extensively by the method of section 6.4.1 in
CPMB (op. cit.): 3 hr at 25°C in 3X SSC, 0.1% (w/v) SDS, followed
by 1 hr at 65°C in the same solution, then rinsed in 2X SSC in
preparation for the hybridization step (20X SSC = 3 M NaCl, 0.3 M
sodium citrate, pH 7.0).

Amplification of a specific genomic fragment of a TcaC gene.
Based on the N-terminal amino acid sequence determined for the
purified TcaC peptide fraction (disclosed herein as SEQ ID NO:2),
a pool of degenerate oligonucleotides (pool S4Psh) was
synthesized by standard β-cyanoethyl chemistry on an Applied
BioSystems ABI394 DNA/RNA Synthesizer (Perkin Elmer, Foster City,
CA). The oligonucleotides were deprotected 8 hours at 55°C,
dissolved in water, quantitated by spectrophotometric
measurement, and diluted for use. This pool corresponds to the
determined N-terminal amino acid sequence of the TcaC peptide.
The determined amino acid sequence and the corresponding
degenerate DNA sequence are given below, where A, C, G, and T are
the standard DNA bases, and I represents inosine:

Amino Acid    Met  Gln      Asp      Ser              Pro  Glu      Val
S4Psh     5'  ATG  CA(A/G)  GA(T/C)  (T/A)(C/G)(T/A)  CCI  GA(A/G)  GT  3'

Another set of degenerate oligonucleotides was synthesized
(pool P2.3.5R), representing the complement of the coding strand
for the determined amino acid sequence of SEQ ID NO:17:

Amino Acid    [illegible]  Phe      Asn      Ile        Asp      Asp      Val
Codons    5'  [illegible]  TT(T/C)  AA(T/C)  AT(A/T/C)  GA(T/C)  GA(T/C)  GT  3'
P2.3.5R   3'  CC(A/C/G/T)  AA(A/G)  TT(A/G)  TA(T/A/G)  CT(A/G)  CT(A/G)  CA  5'

These oligonucleotides were used as primers in Polymerase
Chain Reactions (PCR®, Roche Molecular Systems, Branchburg, NJ) to
amplify a specific DNA fragment from genomic DNA prepared from
Photorhabdus strain W-14 (see above). A typical reaction (50 µl)
contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng
of genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and
dTTP, 1X GeneAmp® PCR buffer, and 2.5 units of AmpliTaq® DNA
polymerase (both from Roche Molecular Systems; 10X GeneAmp® buffer
is 100 mM Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin).
Amplifications were performed in a Perkin Elmer Cetus DNA Thermal
Cycler (Perkin Elmer, Foster City, CA) using 35 cycles of 94°C
(1.0 min), 55°C (2.0 min), 72°C (3.0 min), followed by an
extension period of 7.0 min at 72°C. Amplification products were
analyzed by electrophoresis through 2% w/v NuSieve® 3:1 agarose
(FMC BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA,
pH 8.0). A specific product of estimated size 250 bp was
observed amongst numerous other amplification products by
ethidium bromide (0.5 µg/ml) staining of the gel and examination
under ultraviolet light.
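
As a rough planning aid, the hold time of the cycling program described above can be totalled from the stated step durations (35 cycles of 1.0, 2.0, and 3.0 min, plus a 7.0 min final extension). This is only a sketch: the step labels are conventional names assumed here, and ramp times between temperatures are ignored because the text does not give them.

```python
# Step durations in minutes, as stated in the text; the labels are assumed names.
CYCLES = 35
STEP_MINUTES = {"denaturation": 1.0, "annealing": 2.0, "extension": 3.0}
FINAL_EXTENSION_MIN = 7.0


def program_minutes(cycles, step_minutes, final_extension_min):
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(step_minutes.values()) + final_extension_min


total_min = program_minutes(CYCLES, STEP_MINUTES, FINAL_EXTENSION_MIN)
total_hr = total_min / 60.0
```

The programmed holds alone come to a little over three and a half hours; actual run time on a given instrument would be longer once ramping is included.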

The region of the gel containing an approximately 250 bp
product was excised, and a small plug (0.5 mm dia.) was removed
and used to supply template for PCR amplification (40 cycles).
The reaction (50 µl) contained the same components as above,
minus genomic template DNA. Following amplification, the ends of
the fragments were made blunt and were phosphorylated by
incubation at 25°C for 20 min with 1 unit of T4 DNA polymerase
(NEB), 1 nmol ATP, and 2.15 units of T4 kinase (Pharmacia Biotech
Inc., Piscataway, NJ).

DNA fragments were separated from residual primers by
electrophoresis through 1% w/v GTG® agarose (FMC) in TEA. A gel
slice containing fragments of apparent size 250 bp was excised,
and the DNA was extracted using a Qiaex kit (Qiagen Inc.,
Chatsworth, CA).

The extracted DNA fragments were ligated to plasmid vector
pBC KS(+) (Stratagene) that had been digested to completion with
restriction enzyme Sma I and extracted in a manner similar to
that described for pWE15 DNA above. A typical ligation reaction
(16.3 µl) contained 100 ng of digested pBC KS(+) DNA, 70 ng of
250 bp fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of
T4 DNA ligase (Collaborative Biomedical Products, Bedford, MA),
in 1X ligation buffer (50 mM Tris-HCl, pH 7.4; 10 mM MgCl2; 10 mM
dithiothreitol; 1 mM spermidine; 1 mM ATP; 100 µg/ml bovine serum
albumin). Following overnight incubation at 14°C, the ligated
products were transformed into frozen, competent Escherichia coli
DH5α cells (Gibco BRL) according to the supplier's
recommendations, and plated on LB-Cam35 plates containing IPTG
(119 µg/ml) and X-gal (50 µg/ml). Independent white colonies
were picked, and plasmid DNA was prepared by a modified
alkaline-lysis/PEG precipitation method (PRISM™ Ready Reaction DyeDeoxy™
Terminator Cycle Sequencing Kit Protocols; ABI/Perkin Elmer).
The nucleotide sequence of both strands of the insert DNA was
determined, using T7 primers (pBC KS(+) bases 601-623:
TAAAACGACGGCCAGTGAGCGCG) and LacZ primers (pBC KS(+) bases
792-816: ATGACCATGATTACGCCAAGCGCGC) and protocols supplied with the
PRISM™ sequencing kit (ABI/Perkin Elmer). Nonincorporated
dye-terminator dideoxyribonucleotides were removed by passage through
Centri-Sep 100 columns (Princeton Separations, Inc., Adelphia,
NJ) according to the manufacturer's instructions. The DNA
sequence was obtained by analysis of the samples on an ABI Model
373A DNA Sequencer (ABI/Perkin Elmer). The DNA sequences of two
isolates, GZ4 and HB14, were found to be as illustrated in Figure
1.

This sequence illustrates the following features: i) bases
1-20 represent one of the 64 possible sequences of the S4Psh
degenerate oligonucleotides; ii) the sequence of amino acids 1-3
and 6-12 corresponds exactly to that determined for the N-terminus
of TcaC (disclosed as SEQ ID NO:2); iii) the fourth amino acid
encoded is a cysteine residue rather than serine, a difference
that is encoded within the degeneracy for the serine codons (see
above); iv) the fifth amino acid encoded is proline,
corresponding to the TcaC N-terminal sequence given as SEQ ID
NO:2; v) bases 257-276 encode one of the 192 possible sequences
designed into the degenerate pool; vi) the TGA termination codon
introduced at bases 268-270 is the result of complementarity to
the degeneracy built into the oligonucleotide pool at the
corresponding position, and does not indicate a shortened reading
frame for the corresponding gene.
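
The pool sizes of 64 and 192 and the 20-base primer lengths quoted above follow directly from the degenerate notation: each parenthesized position multiplies the number of distinct sequences by its number of alternatives, while inosine (I) is a single base and adds no multiplicity. The sketch below illustrates that counting. The pool strings are readings of a degraded scan and should be treated as assumptions; in particular the fixed bases of the first P2.3.5R codon are as transcribed, and only its four-fold position matters for the count.

```python
import re


def degeneracy(pool):
    """Number of distinct sequences in a pool written like 'CA(A/G) GA(T/C)'."""
    count = 1
    for group in re.findall(r"\(([^)]*)\)", pool):
        count *= len(group.split("/"))
    return count


def pool_length(pool):
    """Primer length in bases, counting each degenerate position once."""
    collapsed = re.sub(r"\([^)]*\)", "N", pool.replace(" ", ""))
    return len(collapsed)


# Pool strings as read from the tables in the text (assumed readings).
S4PSH = "ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT"
P235R = "CC(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA"

s4psh_count = degeneracy(S4PSH)   # six two-fold positions
p235r_count = degeneracy(P235R)   # 4 x 2 x 2 x 3 x 2 x 2
```

Because the count depends only on the number of alternatives at each position, it is unaffected by any uncertainty in the fixed bases of the transcribed pools.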

Labeling of a TcaC peptide gene-specific probe. DNA
fragments corresponding to the above 276 bases were amplified (35
cycles) by PCR® in a 100 µl reaction volume, using 100 pmol each
of P2Psh and P2.3.5R primers, 10 ng of plasmids GZ4 or HB14 as
templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, 5 units of
AmpliTaq® DNA polymerase, and 1X concentration of GeneAmp® buffer,
under the same temperature regimes as described above. The
amplification products were extracted from a 1% GTG® agarose gel
by Qiaex kit and quantitated by fluorometry.
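
The reagent amounts in these reactions are given as absolute quantities per tube; converting them to final concentrations makes the two reaction formats easier to compare. The sketch below assumes the reading of 100 pmol of each primer pool and 20 nmol of each dNTP in 100 µl for the labeling reaction, and 125 pmol and 10 nmol in 50 µl for the earlier genomic amplification. The helper name is hypothetical.

```python
def micromolar(amount_pmol, volume_ul):
    """Final concentration in µM; 1 pmol per µl equals 1 µmol per litre."""
    return amount_pmol / volume_ul


PMOL_PER_NMOL = 1000.0

# Probe-template amplification (100 µl reaction).
probe_primer_uM = micromolar(100.0, 100.0)
probe_dntp_uM = micromolar(20.0 * PMOL_PER_NMOL, 100.0)

# Earlier genomic amplification (50 µl reaction).
genomic_primer_uM = micromolar(125.0, 50.0)
genomic_dntp_uM = micromolar(10.0 * PMOL_PER_NMOL, 50.0)
```

On these figures both reactions carry the same concentration of each dNTP, while the primer pools are more concentrated in the genomic amplification than in the reamplification from plasmid template.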

The extracted amplification products from plasmid HB14
template (approximately 400 ng) were split into five aliquots and
labeled with 32P-dCTP using the High Prime Labeling Mix
(Boehringer Mannheim) according to the manufacturer's
instructions. Nonincorporated radioisotope was removed by
passage through NucTrap® Probe Purification Columns (Stratagene),
according to the supplier's instructions. The specific activity
of the labeled DNA product was determined by scintillation
counting to be 3.11 x 10« dpm/(ig. This labeled DNA was used to 
probe membranes prepared from 800 members of the genomic library. 

Screening with a TcaC-peptide gene-specific probe. The
radiolabeled HB14 probe was boiled approximately 10 min, then
added to "minimal hyb" solution. (Note: The "minimal hyb" method
is taken from a CERES protocol: "Restriction Fragment Length
Polymorphism Laboratory Manual version 4.0", sections 4-40 and
4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its
successors operating as Linkage Genetics.) "Minimal hyb"
solution contains 10% w/v PEG (polyethylene glycol, M.W. approx.
8000), 7% w/v SDS, 0.6X SSC, 10 mM sodium phosphate buffer (from
a 1 M stock containing 95 g/l NaH2PO4·1H2O and 84.5 g/l
Na2HPO4·7H2O), 5 mM EDTA, and 100 µg/ml denatured salmon sperm
DNA. Membranes were blotted dry briefly then, without
prehybridization, 5 strips of membrane were placed in each of 2
plastic boxes containing 75 ml of "minimal hyb" and 2.6 ng/ml of
radiolabeled HB14 probe. These were incubated overnight with
slow shaking (50 rpm) at 60°C. The filters were washed three
times for approximately 10 min each at 25°C in "minimal hyb wash
solution" (0.25X SSC, 0.2% SDS), followed by two 30-min washes
with slow shaking at 60°C in the same solution. The filters were
placed on paper covered with Saran Wrap® (Dow Brands,
Indianapolis, IN) in a light-tight autoradiographic cassette and
exposed to X-Omat X-ray film (Kodak, Rochester, NY) with two
DuPont Cronex Lightning-Plus C1 enhancers (Sigma Chemical Co.,
St. Louis, MO), for 4 hr at -70°C. Upon development (standard
photographic procedures), significant signals were evident in
both replicates amongst a high background of weaker, more
irregular signals. The filters were again washed for about 4 hr
at SB^C in "minimal hyb wash solution" and then placed again in 
the cassettes and film was exposed overnight at -70°C. Twelve
possible positives were identified due to strong signals on both
of the duplicate 96-well colony impressions. No signal was seen
with negative control membranes (colonies of XL1 Blue MR cells
containing pWE15), and a very strong signal was seen with
positive control membranes (DH5α cells containing the GZ4 isolate
of the PCR product) that had been processed concurrently with the
experimental samples.
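
The probe quantities in this screen can be cross-checked by simple arithmetic: two boxes of 75 ml hybridization solution at 2.6 ng/ml of probe should account for most of the roughly 400 ng of HB14 fragment that was labeled. A minimal sketch of that check follows; the variable names are illustrative, and losses during column purification are not modelled.

```python
# Values as stated in the text.
BOXES = 2
VOLUME_PER_BOX_ML = 75.0
PROBE_NG_PER_ML = 2.6
LABELED_INPUT_NG = 400.0   # "approximately 400 ng" of HB14 amplification product


def probe_mass_ng(boxes, volume_ml, conc_ng_per_ml):
    """Total probe mass distributed across all hybridization boxes."""
    return boxes * volume_ml * conc_ng_per_ml


probe_used_ng = probe_mass_ng(BOXES, VOLUME_PER_BOX_ML, PROBE_NG_PER_ML)
fraction_of_input = probe_used_ng / LABELED_INPUT_NG
```

The total of about 390 ng is consistent with essentially all of the labeled product having been used in the two hybridization boxes.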

The twelve putative hybridization-positive colonies were
retrieved from the frozen 96-well library plates and grown
overnight at 37°C on solid LB-Amp100 medium. They were then
patched (3/plate, plus three negative controls: XL1 Blue MR cells
containing the pWE15 vector) onto solid LB-Amp100. Two sets of
membranes (Magna NT nylon, 0.45 micron) were prepared for
hybridization. The first set was prepared by placing a filter
directly onto the colonies on a patch plate, then removing it
with adherent bacterial cells, and processing as below. Filters
of the second set were placed on plates containing LB-Amp100
medium, then inoculated by transferring cells from the patch
plates onto the filters. After overnight growth at 37°C, the
filters were removed from the plates and processed.

Bacterial cells on the filters were lysed and DNA denatured
by placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N
NaOH in a plastic plate for 3 min. The filters were blotted dry
on a paper towel, then the process was repeated with fresh 0.5 N
NaOH. After blotting dry, the filters were neutralized by
placing each on a 1.0 ml pool of 1 M Tris-HCl, pH 7.5 for 3 min,
blotted dry, and reneutralized with fresh buffer. This was
followed by two similar soakings (5 min each) on pools of 0.5 M
Tris-HCl pH 7.5 plus 1.5 M NaCl. After blotting dry, the DNA was
UV crosslinked to the filter (as above), and the filters were
washed (25°C, 100 rpm) in about 100 ml of 3X SSC plus 0.1% (w/v)
SDS (4 times, 30 min each with fresh solution for each wash).
They were then placed in a minimal volume of prehybridization
solution (5X SSC plus 1% w/v each of Ficoll 400 (Pharmacia),
polyvinylpyrrolidone (av. M.W. 360,000; Sigma), and bovine serum
albumin Fraction V (Sigma)) for 2 hr at 65°C, 50 rpm. The
prehybridization solution was removed, and replaced with the HB14
32P-labeled probe that had been saved from the previous
hybridization of the library membranes and which had been
denatured at 95°C for 5 min. Hybridization was performed at 60°C
for 16 hr with shaking at 50 rpm.

Following removal of the labeled probe solution, the
membranes were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC
(about 150 ml each wash). They were then washed for 3 hr at 65°C
(50 rpm) in 0.25X SSC plus 0.2% SDS (minimal hyb wash solution)
and exposed to X-ray film as described above for 1.5 hr at 25°C
(no enhancer screens). This exposure revealed very strong
hybridization signals to cosmid isolates 22G12, 25A10, 26A5, and
26B10, and a very weak signal with cosmid isolate 8B10. No
signal was seen with the negative control (pWE15) colonies, and a
very strong signal was seen with positive control membranes (DH5α
cells containing the GZ4 isolate of the PCR product) that had
been processed concurrently with the experimental samples.

Amplification of a specific genomic fragment of a TcaBi gene.
Based on the N-terminal amino acid sequence determined for the
purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3), a
pool of degenerate oligonucleotides (pool P8F) was synthesized as
described for peptide TcaC. The determined amino acid sequence
and the corresponding degenerate DNA sequence are given below,
where A, C, G, and T are the standard DNA bases, and I represents
inosine:



Amino Acid    Leu      Phe Thr Gln    Thr  Leu      Lys  Glu  Ala  Arg
P8F       5'  (C/T)TI  [illegible]    ACI  (C/T)TI  AAA  GAA  GCI  (A/C)G  3'



Another set of degenerate oligonucleotides was synthesized
(pool P8.108.3R), representing the complement of the coding
strand for the determined amino acid sequence of the TcaBi-PT108
internal peptide (disclosed herein as SEQ ID NO:20):

Amino Acid    Met  Tyr      Tyr      Ile        Gln      Ala          Gln      Gln
Codons    5'  ATG  TA(T/C)  TA(T/C)  AT(T/C/A)  CA(A/G)  GC(A/C/G/T)  CA(A/G)  CA(A/G)  3'
P8.108.3R 3'  TAC  AT(A/G)  AT(A/G)  TA(A/G/T)  GT(T/C)  CGI          GT(T/C)  GT       5'

These oligonucleotides were used as primers for PCR using
HotStart 50 Tubes™ (Molecular Bio-Products, Inc., San Diego, CA)
to amplify a specific DNA fragment from genomic DNA prepared from
Photorhabdus strain W-14 (see above). A typical reaction (50 µl)
contained (bottom layer) 25 pmol of each primer pool P8F and
P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X
GeneAmp® PCR buffer, and (top layer) 230 ng of genomic template
DNA, 8 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of
AmpliTaq® DNA polymerase, in 1X GeneAmp® PCR buffer.
Amplifications were performed by 35 cycles as described for the
TcaC peptide. Amplification products were analyzed by
electrophoresis through 0.7% w/v SeaKem® LE agarose (FMC) in TEA
buffer. A specific product of estimated size 1600 bp was
observed.

Four such reactions were pooled, and the amplified DNA was
extracted from a 1.0% SeaKem® LE gel by Qiaex kit as described for
the TcaC peptide. The extracted DNA was used directly as the
template for sequence determination (PRISM™ Sequencing Kit) using
the P8F and P8.108.3R primer pools. Each reaction contained
about 100 ng template DNA and 25 pmol of one primer pool, and was
processed according to standard protocols as described for the
TcaC peptide. An analysis of the sequence derived from extension
of the P8F primers revealed the short DNA sequence (and encoded
amino acid sequence):

GAT  GCA  TTG  NTT    GCT
Asp  Ala  Leu  (Val)  Ala

which corresponds to a portion of the N-terminal peptide sequence
disclosed as SEQ ID NO:3 (TcaBi).

Labeling of a TcaBi-peptide gene-specific probe.
Approximately 50 ng of gel-purified TcaBi DNA fragment was
labeled with 32P-dCTP as described above, and nonincorporated
radioisotopes were removed by passage through a NICK Column®
(Pharmacia). The specific activity of the labeled DNA was
40 determined to be 6 x lo"* dpm/jig. This labeled DNA was used to 
probe colony membranes prepared from members of the genomic
library that had hybridized to the TcaC-peptide specific probe.

The membranes containing the 12 colonies identified in the
TcaC-probe library screen (see above) were stripped of
radioactive TcaC-specific label by boiling twice for
approximately 30 min each time in 1 liter of 0.1X SSC plus 0.1%
SDS. Removal of radiolabel was checked with a 6 hr film
exposure. The stripped membranes were then incubated with the
TcaBi peptide-specific probe prepared above. The labeled DNA was
denatured by boiling for 10 min, and then added to the filters
that had been incubated for 1 hr in 100 ml of "minimal hyb"
solution at 60°C. After overnight hybridization at this
temperature, the probe solution was removed, and the filters were
washed as follows (all in 0.3X SSC plus 0.1% SDS): once for 5 min
at 25°C, once for 1 hr at 60°C in fresh solution, and once for 1
hr at 63°C in fresh solution. After 1.5 hr exposure to X-ray
film by standard procedures, 4 strongly hybridizing colonies were
observed. These were, as with the TcaC-specific probe, isolates
22G12, 25A10, 26A5, and 26B10.

The same TcaBi probe solution was diluted with an equal
volume (about 100 ml) of "minimal hyb" solution, and then used to
screen the membranes containing the 800 members of the genomic
library. After hybridization, washing, and exposure to X-ray
film as described above, only the four cosmid clones 22G12,
25A10, 26A5, and 26B10 were found to hybridize strongly to this
probe.





ISOLATION OF SUBCLONES CONTAINING GENES ENCODING TcaC AND
TcaBi PEPTIDES, AND DETERMINATION OF DNA BASE SEQUENCE THEREOF:

Three hybridization-positive cosmids in strain XL1 Blue MR were
grown with shaking overnight (200 rpm) at 30°C in 100 ml
TB-Amp100. After harvesting the cells by centrifugation, cosmid DNA
was prepared using a commercially available kit (BIGprep™, 5
Prime 3 Prime, Inc., Boulder, CO), following the manufacturer's
protocols. Only one cosmid, 26A5, was successfully isolated by
this procedure. When digested with restriction enzyme EcoR I
(NEB) and analyzed by gel electrophoresis, fragments of
approximate sizes 14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp
were detected. A second attempt to isolate cosmid DNA from the
same three strains (8 ml cultures; TB-Amp100; 30°C) utilized a
boiling miniprep method (Evans, G. and G. Wahl, 1987, "Cosmid
vectors for genomic walking and rapid restriction mapping," in
Guide to Molecular Cloning Techniques, Meth. Enzymology, vol.
152, S. Berger and A. Kimmel, eds., pgs. 604-610). Only one
cosmid, 25A10, was successfully isolated by this method. When
digested with restriction enzyme EcoR I (NEB) and analyzed by gel
electrophoresis, this cosmid showed a fragmentation pattern
identical to that previously seen with cosmid 26A5.
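
The EcoR I fragment sizes reported above give a quick consistency check on the cosmid: summing them and subtracting the vector band should yield an insert that falls within the 20-40 kbp range selected on the sucrose gradient. The sketch below performs that sum, taking the band sizes as stated (they are approximate gel estimates) and treating the 8 kbp band as vector only, which is a simplifying assumption.

```python
# Approximate EcoR I fragment sizes (kbp) reported for cosmid 26A5.
ECOR1_FRAGMENTS_KBP = [14.0, 10.0, 8.0, 5.0, 3.3, 2.9, 1.5]
VECTOR_FRAGMENT_KBP = 8.0   # band identified in the text as vector


def insert_size_kbp(fragments_kbp, vector_kbp):
    """Estimated insert size: sum of all bands minus the vector band."""
    return sum(fragments_kbp) - vector_kbp


total_kbp = sum(ECOR1_FRAGMENTS_KBP)
insert_kbp = insert_size_kbp(ECOR1_FRAGMENTS_KBP, VECTOR_FRAGMENT_KBP)
within_selected_range = 20.0 <= insert_kbp <= 40.0
```

The estimated insert of roughly 37 kbp sits inside the size-selected range, although co-migrating or very small fragments would not be captured by a sum of visible bands.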

A 0.15 µg sample of 26A5 cosmid DNA was used to transform 50
µl of E. coli DH5α cells (Gibco BRL), by the supplier's
protocols. A single colony isolate of that strain was inoculated
into 4 ml of TB-Amp100, and grown for 8 hr at 37°C.
Chloramphenicol was added to a final concentration of 225 µg/ml,
incubation was continued for another 24 hr, then cells were
harvested by centrifugation and frozen at -20°C. Isolation of
the 26A5 cosmid DNA was by a standard alkaline lysis miniprep
(Maniatis et al., op. cit., p. 382), modified by increasing all
volumes by 50% and with stirring or gentle mixing, rather than
vortexing, at every step. After washing the DNA pellet in 70%
ethanol, it was dissolved in TE containing 25 ng/ml ribonuclease
A (Boehringer Mannheim).

Identification of EcoR I fragments hybridizing to GZ4-derived
and TcaBi probes. Approximately 0.4 µg of cosmid 25A10
(from XL1 Blue MR cells) and about 0.5 µg of cosmid 26A5 (from
chloramphenicol-amplified DH5α cells) were each digested with
about 15 units of EcoR I (NEB) for 85 min, frozen overnight, then
heated at 65°C for five min, and electrophoresed in a 0.7%
agarose gel (SeaKem® LE, 1X TEA, 80 volts, 90 min). The DNA was
stained with ethidium bromide as described above, and
photographed under ultraviolet light. The EcoR I digest of
cosmid 25A10 was a complete digestion, but the sample of cosmid
26A5 was only partially digested under these conditions. The
agarose gel containing the DNA fragments was subjected to
depurination, denaturation and neutralization, followed by
Southern blotting onto a Magna NT nylon membrane, using a high
salt (20X SSC) protocol, all as described in section 2.9 of
Ausubel et al. (CPMB, op. cit.). The transferred DNA was then
UV-crosslinked to the nylon membrane as before.
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An TcaC -peptide specific DMA fragment corresponding ro the 
insert of piasmid isolate GZ;4 was anplified by PGR* in a 100 ml 
reaction voliame as described previously above. The ampiif icat ior 
products from three such reactions were pooled and were extracted 
5 from a 1% GTG* agarose gel by Qiaex kit, as described above, and 
quantitated by fluorometry. The gel-purified DNA (100 ng) was 
labeled with '-p-dCTP using the High Prime Labeling Mix 
(Boehringer Mannheim) as described above, to a specific activity 
of 6.34 X 10' dpm/ng. 

The 32P-labeled GZ4 probe was boiled 10 min, then added to
"minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane
containing the digested cosmid DNA fragments was added, and
incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The
membrane was then washed 3 times at 25°C for about 5 min each
(minimal hyb wash solution), followed by two washes for 30 min
each at 60°C. The blot was exposed to film (with enhancer
screens) for about 30 min at -70°C. The GZ4 probe hybridized
strongly to the 5.0 kbp (apparent size) EcoR I fragment of both
cosmids, 26A5 and 25A10.

The membrane was stripped of radioactivity by boiling for
about 30 min in 0.1X SSC plus 0.1% SDS, and absence of
radiolabel was checked by exposure to film. It was then
hybridized at 60°C for 3.5 hours with the (denatured) TcaBi probe
in "minimal hyb" buffer previously used for screening the colony
membranes (above), washed as described previously, and exposed to
film for 40 min at -70°C with two enhancer screens. With both
cosmids, the TcaBi probe hybridized lightly with the about 5.0
kbp EcoR I fragment, and strongly with a fragment of
approximately 2.9 kbp.
The sample of cosmid 26A5 DNA previously described (from
DH5α cells) was used as the source of DNA from which to subclone
the bands of interest. This DNA (2.5 µg) was digested with about
3 units of EcoR I (NEB) in a total volume of 30 µl for 1.5 hr, to
give a partial digest, as confirmed by gel electrophoresis. Ten
µg of pBC KS(+) DNA (Stratagene) were digested for 1.5 hr with
20 units of EcoR I in a total volume of 20 µl, leading to total
digestion as confirmed by electrophoresis. Both EcoR I-cut DNA
preparations were diluted to 50 µl with water, to each an equal
volume of PCI was added, the suspension was gently mixed, spun in
a microcentrifuge and the aqueous supernatant was collected. DNA
was precipitated by 150 µl ethanol, and the mixture was placed at
-20°C overnight. Following centrifugation and drying, the
EcoR I-digested pBC KS(+) was dissolved in 100 µl TE; the
partially digested 26A5 was dissolved in 20 µl TE. DNA recovery
was checked by fluorometry.

In separate reactions, approximately 60 ng of EcoR I-digested
pBC KS(+) DNA was ligated with approximately 180 ng or
270 ng of partially digested cosmid 26A5 DNA. Ligations were
carried out in a volume of 20 µl at 15°C for 5 hr, using T4
ligase and buffer from New England BioLabs. The ligation
mixture, diluted to 100 µl with sterile TE, was used to transform
frozen, competent DH5α cells (Gibco BRL) according to the
supplier's instructions. Varying amounts (25-200 µl) of the
transformed cells were plated on freshly prepared solid LB-Cam35
medium with 1 mM IPTG and 50 mg/l X-gal. Plates were incubated
at 37°C about 20 hr, then chilled in the dark for approximately 3
hr to intensify color for insert selection. White colonies were
picked onto patch plates of the same composition and incubated
overnight at 37°C.

Two colony lifts of each of the selected patch plates were 
prepared as follows. After picking white colonies to fresh 
plates, round Magna NT nylon membranes were pressed onto the 
patch plates, the membrane was lifted off, and subjected to 
denaturation, neutralization and UV crosslinking as described 
above for the library colony membranes. The crosslinked colony 
lifts were vigorously washed, including gently wiping off the 
excess cell debris with a tissue. One set was hybridized with 
the GZ4(TcaC) probe solution described earlier, and the other set 
was hybridized with the TcaBi probe solution described earlier, 
according to the "minimal hyb" protocol, followed by washing and
film exposure as described for the library colony membranes. 

Colonies showing hybridization signals either only with the
GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi
probe, were selected for further work and cells were streaked for
single colony isolation onto LB-Cam35 media with IPTG and X-gal as
before. Approximately 35 single colonies, from 16 different
isolates, were picked into liquid LB-Cam35 media and grown
overnight at 37°C; the cells were collected by centrifugation and
plasmid DNA was isolated by a standard alkaline lysis miniprep
according to Maniatis et al. (op. cit., p. 368). DNA pellets were
dissolved in TE + 25 µg/ml ribonuclease A and DNA concentration
was determined by fluorometry. The EcoR I digestion pattern was
analyzed by gel electrophoresis. The following isolates were
picked as useful. Isolate A17.2 contains religated pBC KS(+)
only and was used for a (negative) control. Isolates D38.3 and
C44.1 each contain only the 2.9 kbp, TcaBi-hybridizing EcoR I
fragment inserted into pBC KS(+). These plasmids, named pDAB2000
and pDAB2001, respectively, are illustrated in Fig. 2.

Isolate A35.3 contains only the approximately 5 kbp,
GZ4-hybridizing EcoR I fragment, inserted into pBC KS(+). This
plasmid was named pDAB2002 (also Fig. 2). These isolates
provided templates for DNA sequencing.

Plasmids pDAB2000 and pDAB2001 were prepared using the
BIGprep™ kit as before. Cultures (30 ml) were grown overnight in
TB-Cam35 to an OD600 of 2, then plasmid was isolated according to
the manufacturer's directions. DNA pellets were redissolved in
100 µl TE each, and sample integrity was checked by EcoR I
digestion and gel electrophoretic analysis.

Sequencing reactions were run in duplicate, with one
replicate using as template pDAB2000 DNA, and the other replicate
using as template pDAB2001 DNA. The reactions were carried out
using the dideoxy dye terminator cycle sequencing method, as
described above for the sequencing of the GZ4/HB14 DNAs. Initial
sequencing runs utilized as primers the LacZ and T7 primers
described above, plus primers based on the determined sequence of
the TcaBi PCR amplification product (TH1 =
ATTGCAGACTGCCAATCGCTTCGG, TH12 = GAGAGTATCCAGACCGCGGATGATCTG).

After alignment and editing of each sequencing output, each
was truncated to between 250 and 350 bases, depending on the
integrity of the chromatographic data as interpreted by the
Perkin Elmer Applied Biosystems Division SeqEd 675 software.
Subsequent sequencing "steps" were made by selecting appropriate
sequence for new primers. With a few exceptions, primers
(synthesized as described above) were 24 bases in length with a
50% G+C composition. Sequencing by this method was carried out
on both strands of the approximately 2.9 kbp EcoR I fragment.
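
The primer design rule stated above (about 24 bases, about 50% G+C) can be checked mechanically. The following is a minimal sketch, not part of the original procedure; it assumes the two primer sequences are as printed above and that the first label reads TH1. The function name `gc_percent` is illustrative only.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Primer sequences as printed in the text (labels are an assumption
# where the printed label is hard to read).
primers = {
    "TH1": "ATTGCAGACTGCCAATCGCTTCGG",
    "TH12": "GAGAGTATCCAGACCGCGGATGATCTG",
}

# Map each primer to (length in nt, percent G+C rounded to 0.1).
summary = {name: (len(seq), round(gc_percent(seq), 1))
           for name, seq in primers.items()}

for name, (length, gc) in summary.items():
    print(f"{name}: {length} nt, {gc}% G+C")
```

Both primers fall close to the stated 50% G+C target, and TH12 is one of the exceptions to the 24-base length.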






wo 97/1 7432 PCT/US96/18003 

To further serve as template for DNA sequencing, plasmid DNA
from isolate pDAB2002 was prepared by BIGprep™ kit. Sequencing
reactions were performed and analyzed as described above.
Initially, a T3 primer (pBS SK(+) bases 774-796:
CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS(+) bases 621-643:
GCGCGTAATACGACTCACTATAG) were used to prime the sequencing
reactions from the flanking vector sequences, reading into the
insert DNA. Another set of primers (GZ4F:
GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC;
TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204:
GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal
sequences, which were determined previously by degenerate
oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR
products. From the data generated during the initial rounds of
sequencing, new sets of primers were designed and used to walk
the entire length of the ~5 kbp fragment. A total of 55 oligo
primers was used, enabling the identification of 4832 total bp of
contiguous sequence.
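
Two of the internal primers listed above, GZ4F and TH13, appear to target the same site on opposite strands, which is what allows sequencing to proceed in both directions from a known internal sequence. A minimal sketch of that check follows, assuming the primer sequences are as printed; `reverse_complement` is an illustrative helper, not a function named in the source.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


GZ4F = "GTATCGATTACAACGCTGTCACTTCCC"
TH13 = "GGGAAGTGACAGCGTTGTAATCGATAC"

# True when the two primers anneal to the same site on opposite strands.
same_site_opposite_strands = reverse_complement(GZ4F) == TH13
print(same_site_opposite_strands)
```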

When the DNA sequence of the EcoR I fragment insert of
pDAB2002 is combined with part of the determined sequence of the
pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005
bp was generated (disclosed herein as SEQ ID NO:25). When long
open reading frames were translated into the corresponding amino
acids, the sequence clearly shows the TcaBi N-terminal peptide
(disclosed as SEQ ID NO:3), encoded by bases 19-75, immediately
following a methionine residue (start of translation). Upstream
lies a potential ribosome binding site (bases 1-9), and
downstream, at bases 166-228, is encoded the TcaBi-PT158 internal
peptide (disclosed herein as SEQ ID NO:19). Further downstream,
in the same reading frame, at bases 1738-1773, exists a sequence
encoding the TcaBi-PT108 internal peptide (disclosed herein as
SEQ ID NO:20). Also in the same reading frame, at bases
1897-1923, is encoded the TcaBii N-terminal peptide (disclosed
herein as SEQ ID NO:5), and the reading frame continues
uninterrupted to a translation termination codon at nucleotides
3586-3588.

The lack of an in-frame stop codon between the end of the
sequence encoding TcaBi-PT108 and the start of the TcaBii encoding
region, and the lack of a discernible ribosome binding site
immediately upstream of the TcaBii coding region, indicate that
peptides TcaBii and TcaBi are encoded by a single open reading
frame of 3567 bp (beginning at base pair 16 in SEQ ID NO:25), and
are most likely derived from a single primary gene product of
1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID
NO:26) by post-translational cleavage. If the amino acid
immediately preceding the TcaBii N-terminal peptide represents
the C-terminal amino acid of peptide TcaBi, then the predicted
mass of TcaBi (627 amino acids) is 70,814 Daltons (disclosed
herein as SEQ ID NO:28), somewhat higher than the size observed
by SDS-PAGE (68 kDa). This peptide would be encoded by a
contiguous stretch of 1881 base pairs (disclosed herein as SEQ ID
NO:27). It is thought that the native C-terminus of TcaBi lies
somewhat closer to the C-terminus of TcaBi-PT108. The molecular
mass of PT108 (3.438 kDa; determined during N-terminal amino acid
sequence analysis of this peptide) predicts a size of 30 amino
acids. Using the size of this peptide to designate the
C-terminus of the TcaBi coding region (Glu at position 604 of SEQ
ID NO:28), the derived size of TcaBi is determined to be 604
amino acids or 68,463 Daltons, more in agreement with
experimental observations.

Translation of the TcaBii peptide coding region of 1686 base
pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562
amino acids (disclosed herein as SEQ ID NO:30) with predicted
mass of 60,789 Daltons, which corresponds well with the observed
61 kDa.
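
The base-pair and amino-acid counts quoted for the TcaB reading frame can be cross-checked with simple codon arithmetic. The sketch below is illustrative only; it uses the figures stated in the text and assumes three bases per codon with the stop codon excluded from the quoted lengths.

```python
def codons(base_pairs: int) -> int:
    """Number of whole codons in a coding stretch; rejects partial codons."""
    if base_pairs % 3:
        raise ValueError(f"{base_pairs} bp is not a whole number of codons")
    return base_pairs // 3


# (coding length in bp, stated protein length in amino acids)
regions = {
    "TcaB precursor": (3567, 1189),
    "TcaBi": (1881, 627),
    "TcaBii": (1686, 562),
}

consistent = {name: codons(bp) == aa for name, (bp, aa) in regions.items()}

# If the reading frame starts at base 16 and TcaBi spans 1881 bp,
# the TcaBii coding region should begin at this base.
tcabii_first_base = 16 + regions["TcaBi"][0]

print(consistent, tcabii_first_base)
```

The computed first base of the TcaBii region agrees with the position given above for the start of the TcaBii N-terminal peptide (base 1897), and the two parts sum to the length of the proposed precursor.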

A potential ribosome binding site (bases 3633-3638) is found
48 bp downstream of the stop codon for the tcaB open reading
frame. At bases 3645-3677 is found a sequence encoding the
N-terminus of peptide TcaC (disclosed as SEQ ID NO:2). The open
reading frame initiated by this N-terminal peptide continues
uninterrupted to base 6005 (2361 base pairs, disclosed herein as
the first 2361 base pairs of SEQ ID NO:31). A gene (tcaC)
encoding the entire TcaC peptide (apparent size ~165 kDa; ~1500
amino acids) would comprise about 4500 bp.
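
The estimate of about 1500 amino acids and about 4500 bp for a 165 kDa peptide follows from a common rule of thumb. The sketch below assumes an average residue mass of 110 Da, which is a conventional approximation and is not stated in the source; the function name is hypothetical.

```python
# Assumed average mass of one amino acid residue in a protein (Da).
AVG_RESIDUE_DA = 110.0


def estimate_gene_size(mass_kda: float) -> tuple[int, int]:
    """Estimate (residues, coding bp) from an apparent protein mass in kDa."""
    residues = round(mass_kda * 1000.0 / AVG_RESIDUE_DA)
    return residues, residues * 3


residues, coding_bp = estimate_gene_size(165)
print(residues, coding_bp)
```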

Another isolate containing cloned EcoR I fragments of cosmid
26A5, E20.6, was also identified by its homology to the
previously mentioned GZ4 and TcaBi probes. Agarose gel analysis
of EcoR I digests of the DNA of the plasmid harbored by this
strain (pDAB2004, Fig. 2) revealed insert fragments of estimated
sizes 2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from
primers designed from the sequence of plasmid pDAB2002 revealed
that the 3.3 kbp EcoR I fragment of pDAB2004 lies adjacent to the
5 kbp EcoR I fragment represented in pDAB2002. The 2361 base
pair open reading frame discovered in pDAB2002 continues
uninterrupted for another 2094 bases in pDAB2004 (disclosed
herein as base pairs 2362 to 4458 of SEQ ID NO:31). DNA sequence
analysis using the parent cosmid 26A5 DNA as template confirmed
the continuity of the open reading frame. Altogether, the open
reading frame (tcaC, SEQ ID NO:31) comprises 4455 base pairs, and
encodes a protein (TcaC) of 1485 amino acids (disclosed herein as
SEQ ID NO:32). The calculated molecular size of 166,214 Daltons
is consistent with the estimated size of the TcaC peptide (165
kDa), and the derived amino acid sequence matches exactly that
disclosed for the TcaC N-terminal sequence (SEQ ID NO:2).

The lack of an amino acid sequence corresponding to SEQ ID
NO:17 (used to design the degenerate oligonucleotide primer pool)
in the discovered sequence indicates that the
PCR® products found in isolates GZ4 and HB14, which were used as
probes in the initial library screen, were fortuitously generated
by reverse-strand priming by one of the primers in the degenerate
pool. Further, the derived protein sequence does not include the
internal fragment disclosed herein as SEQ ID NO:18. These
sequences reveal that plasmid pDAB2004 contains the complete
coding region for the TcaC peptide.

Example 9 

Screening of the Photorhabdus genomic library
for genes encoding the TcbAii peptide

This example describes a method used to identify DNA clones
that contain the TcbAii peptide-encoding genes, the isolation of
the gene, and the determination of its partial DNA base sequence.

Primers and PCR reactions

The TcbAii polypeptide of the insect active preparation is
~206 kDa. The amino acid sequence of the N-terminus of this peptide
is disclosed as SEQ ID NO:1. Four pools of degenerate
oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and
TH-7) were synthesized to encode a portion of this amino acid
sequence, as described in Example 8, and are shown below.



Table 11

Amino
Acid    Phe         Ile  Gln      Gly  Tyr      Ser      Asp      Leu      Phe
TH-4    5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  CTI      TT-3'
TH-5    5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  CTI      TT-3'
TH-6    5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  TT(A/G)  TT-3'
TH-7    5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  TT(A/G)  TT-3'

In addition, a primary ("a") and a secondary ("b") sequence
of an internal peptide preparation (TcbAii-PT81) have been
determined and are disclosed herein as SEQ ID NO:23 and SEQ ID
NO:24, respectively. Four pools of degenerate oligonucleotides
("Reverse Primers": TH-8, TH-9, TH-10 and TH-11) were similarly
designed and synthesized to encode the reverse complement of
sequences that encode a portion of the peptide of SEQ ID NO:23,
as shown below.
Sets of these primers were used in PCR® reactions to amplify
TcbAii-encoding gene fragments from the genomic Photorhabdus
luminescens W-14 DNA prepared in Example 6. All PCR® reactions
were run with the "Hot Start" technique using AmpliWax™ gems and
other Perkin Elmer reagents and protocols. Typically, a mixture
(total volume 11 µl) of MgCl2, dNTP's, 10X GeneAmp® PCR Buffer II,
and the primers were added to tubes containing a single wax bead.
(10X GeneAmp® PCR Buffer II is composed of 100 mM Tris-HCl, pH
8.3; and 500 mM KCl.) The tubes were heated to 80°C for 2
minutes and allowed to cool. To the top of the wax seals, a
solution containing 10X GeneAmp® PCR Buffer II, DNA template, and
AmpliTaq® DNA polymerase were added. Following melting of the wax
seal and mixing of components by thermal cycling, final reaction
conditions (volume of 50 µl) were: 10 mM Tris-HCl, pH 8.3; 50 mM
KCl; 2.5 mM MgCl2; 200 µM each in dATP, dCTP, dGTP, dTTP; 1.25 µM
in a single Forward primer pool; 1.25 µM in a single Reverse
primer pool, 1.25 units of AmpliTaq® DNA polymerase, and 170 ng of
template DNA.
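
The final reaction conditions listed above follow from ordinary dilution arithmetic (C1 x V1 = C2 x V2). The following sketch is an illustration under stated assumptions: it uses only the concentrations given in the text, and because the source does not say how the 10X buffer is split between the lower and upper layers, only the total buffer volume is computed. The helper name is hypothetical.

```python
def stock_volume_ul(final_conc: float, stock_conc: float,
                    final_volume_ul: float) -> float:
    """Stock volume needed so that C1*V1 = C2*V2 (same concentration units)."""
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 50.0

# Total 10X buffer (100 mM Tris-HCl, 500 mM KCl) needed for a 1X reaction.
buffer_ul = stock_volume_ul(1.0, 10.0, FINAL_VOLUME_UL)
tris_mm = 100.0 * buffer_ul / FINAL_VOLUME_UL
kcl_mm = 500.0 * buffer_ul / FINAL_VOLUME_UL

# 1 uM equals 1 pmol per ul, so uM times ul gives pmol.
primer_pmol_per_pool = 1.25 * FINAL_VOLUME_UL
dntp_nmol_each = 200.0 * FINAL_VOLUME_UL / 1000.0

print(buffer_ul, tris_mm, kcl_mm, primer_pmol_per_pool, dntp_nmol_each)
```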

The reactions were placed in a thermocycler (as in
Example 8) and run with the following program:

Table 13

Temperature    Time          Cycle Repetition

94°C           2 minutes     1X

94°C           15 seconds
55-65°C        30 seconds    30X
72°C           1 minute

72°C           7 minutes     1X

15°C           Constant
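
As a rough planning aid, the programmed hold times of Table 13 can be summed. This is only a sketch: it ignores ramp times between temperatures, which depend on the instrument, and the data structure and names are illustrative.

```python
# (repeats, [(step label, seconds), ...]) for each block of Table 13.
# The final 15 C hold is open-ended and is therefore left out.
PROGRAM = [
    (1, [("denature 94C", 120)]),
    (30, [("denature 94C", 15), ("anneal 55-65C", 30), ("extend 72C", 60)]),
    (1, [("final extension 72C", 420)]),
]


def hold_seconds(program) -> int:
    """Total programmed hold time in seconds, excluding temperature ramps."""
    return sum(repeats * sum(seconds for _, seconds in steps)
               for repeats, steps in program)


total_seconds = hold_seconds(PROGRAM)
total_minutes = total_seconds / 60
print(total_seconds, total_minutes)
```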

A series of amplifications was run at three different
annealing temperatures (55°, 60°, 65°C) using the degenerate
primer pools. Reactions with annealing at 65°C had no
amplification products visible following agarose gel
electrophoresis. Reactions having a 60°C annealing regime and
containing primers TH-5+TH-10 produced an amplification product
that had a mobility corresponding to 2.9 kbp. A lesser amount of
the 2.9 kbp product was produced under these conditions with
primers TH-7+TH-10. When reactions were annealed at 55°C, these
primer pairs produced more of the 2.9 kbp product, and this
product was also produced by primer pairs TH-5+TH-8 and
TH-5+TH-11. Additional very faint 2.9 kbp bands were seen in lanes
containing amplification products from primer pairs TH-7 plus
TH-8, TH-9, TH-10, or TH-11.
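
The forward primer pools of Table 11 differ in how many distinct sequences each contains, which is one factor that may influence how efficiently a pool primes. A minimal sketch of that count follows. It assumes the pool sequences as read from Table 11 (with the Ser codon of TH-5 and TH-7 taken as AG(T/C)) and treats inosine (I) as a single base that does not add degeneracy; the function names are illustrative.

```python
import re


def pool_degeneracy(primer: str) -> int:
    """Count distinct sequences in a pool written like 'TT(T/C) ATI ...'."""
    count = 1
    for choices in re.findall(r"\(([^)]+)\)", primer):
        count *= len(choices.split("/"))
    return count


def primer_length(primer: str) -> int:
    """Length in nucleotides; each '(X/Y)' group occupies one position."""
    collapsed = re.sub(r"\([^)]+\)", "N", primer)
    return len(collapsed.replace(" ", ""))


COMMON = "TT(T/C) ATI CA(A/G) GGI TA(T/C)"
pools = {
    "TH-4": COMMON + " TCI GA(T/C) CTI TT",
    "TH-5": COMMON + " AG(T/C) GA(T/C) CTI TT",
    "TH-6": COMMON + " TCI GA(T/C) TT(A/G) TT",
    "TH-7": COMMON + " AG(T/C) GA(T/C) TT(A/G) TT",
}

degeneracy = {name: pool_degeneracy(seq) for name, seq in pools.items()}
lengths = {name: primer_length(seq) for name, seq in pools.items()}
print(degeneracy, lengths)
```

Under these assumptions, TH-7 is the most degenerate of the four pools, so each individual sequence in it is present at the lowest relative concentration.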

To obtain sufficient PCR amplification product for cloning
and DNA sequence determination, 10 separate PCR reactions were
set up using the primers TH-5+TH-10, and were run using the above
conditions with a 55°C annealing temperature. All reactions were
pooled and the 2.9 kbp product was purified by Qiaex extraction
from an agarose gel as described above.

Additional sequences determined for TcbAii internal peptides
are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As
before, degenerate oligonucleotides (Reverse primers TH-17 and
TH-18) were made corresponding to the reverse complement of
sequences that encode a portion of the amino acid sequence of
these peptides.

Table 14

From SEQ ID NO:21

Amino
Acid     Met     Glu      Thr  Gln      Asn      Ile
TH-17    3'-TAC  CT(T/C)  TGI  GT(T/C)  TT(A/G)  TAI-5'

Table 15

From SEQ ID NO:22

Amino
Acid     Asn         Pro  Ile  Asn      Ile  Asn      Thr  Gly  Ile  Asp
TH-18    3'-TT(A/G)  GGI  TAI  TT(A/G)  TAI  TT(A/G)  TGI  CCI  TAI  CT(A/G)-5'

Degenerate oligonucleotides TH-18 and TH-17 were used in an
amplification experiment with Photorhabdus luminescens W-14 DNA
as template and primers TH-4, TH-5, TH-6, or TH-7 as the
5'-(Forward) primers. These reactions amplified products of
approximately 4 kbp and 4.5 kbp, respectively. These DNAs were
transferred from agarose gels to nylon membranes and hybridized
with a 32P-labeled probe (as described above) prepared from the
2.9 kbp product amplified by the TH-5+TH-10 primer pair. Both the
4 kbp and the 4.5 kbp amplification products hybridized strongly
to the 2.9 kbp probe. These results were used to construct a map
ordering the TcbAii internal peptide sequences as shown in
Fig. 3. Approximate distances between the primers are shown in
nucleotides in Fig. 3.

DNA sequence of the 2.9 kbp TcbAii-encoding fragment

Approximately 200 ng of the purified 2.9 kbp fragment
(prepared above) was precipitated with ethanol and dissolved in
17 µl of water. One-half of this was used as sequencing template
with 25 pmol of the TH-5 pool as primers, the other half was used
as template for TH-10 priming. Sequencing reactions were as
given in Example 8. No reliable sequence was produced using the
TH-10 primer pool; however, reactions with TH-5 primer pool
produced the sequence disclosed below:

25 si ^S^l^^ £5^'=°°°^ TCGGTGGAAT CGATCTCCTC ACCGGGGGTt' 

rrl^K??^ f^n^^^^ TGAGCCCAAA AANTGGAATG AAAGAAGTTC AATTTNTTAC 

GTCGCCCGGH TTTAGAAAGN TTArrTGm-CA GCCAGAAAAT rTTOGTrcAG 
GAAATTCCAC CGNTOCTTCT CTCTATTCAT TNGGGCCTCG CCGGGTrCGA ANNaWma 

301 S^J^Sc J^cSS^I f*^'^ Tr«.cNAr.crr m^SSS ^^^'^ 

3() 361 CcSISJS AACTCTCCGG CAAATCGTCC ATCANCCTGA NCCAGGNTTN 






121 

181 
241 



Based on this sequence, a sequencing primer (TH-21, 5'-
CCGGGCGACGTTTATCTAGG-3') was designed to reverse complement bases
120-139, and initiate polymerization towards the 5' end (i.e.,
TH-5 end) of the gel-purified 2.9 kbp TcbAii-encoding PCR
fragment. The determined sequence is shown below, and is
compared to the biochemically determined N-terminal peptide
sequence of TcbAii (SEQ ID NO:1).




TcbAii 2.9 kbp PCR fragment Sequence Confirmation

(Underlined amino acids = encoded by degenerate oligonucleotides)

SEQ ID NO:1        F   I   Q   G   Y   S   D   L   F   G   -   -   A
                           |   |   |   |   |   |   |   |           |
2.9 kbp seq        GC ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT
                       M   Q   G   Y   S   D   L   F   G   N   R   A





From the homology of the derived amino acid sequence to the
biochemically determined one, it is clear that the 2.9 kbp PCR
fragment represents the TcbA coding region. This 2.9 kbp
fragment was then used as a hybridization probe to screen the
Photorhabdus W-14 genomic library prepared in Example 8 for
cosmids containing the TcbAii-encoding gene.
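
The comparison between the sequenced PCR fragment and the N-terminal peptide can be reproduced by translating the in-frame portion of the read. The following sketch uses the standard genetic code and the read as printed in the sequence confirmation above; the leading "GC" is treated as a partial codon and skipped, which is an assumption about the reading frame, and the helper names are illustrative.

```python
# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, one letter per codon."""
    dna = dna.upper().replace(" ", "")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


read = "ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT"
derived = translate(read)

# SEQ ID NO:1 (Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala);
# the read starts at the second residue.
peptide = "FIQGYSDLFGNRA"
matches = sum(x == y for x, y in zip(derived, peptide[1:]))
print(derived, matches, len(derived))
```

The single mismatch is at the second residue (Ile in the peptide, Met in the read), a position that lies within the primer-derived region where the degenerate primer contains inosine.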



Screening the Photorhabdus cosmid library

The 2.9 kb gel-purified PCR fragment was labeled with 32P
using the Boehringer Mannheim High Prime labeling kit as
described in Example 8. Filters containing remnants of
approximately 800 colonies from the cosmid library were screened
as described previously (Example 8), and positive clones were
streaked for isolated colonies and rescreened. Three clones
(8A11, 25G8, and 26D1) gave positive results through several
screening and characterization steps. No hybridization of the
TcbAii-specific probe was ever observed with any of the four
cosmids identified in Example 8, which contain the tcaB and
tcaC genes. DNA from cosmids 8A11, 25G8, and 26D1 was digested
with restriction enzymes Bgl II, EcoR I or Hind III (either alone
or in combination with one another), and the fragments were
separated on an agarose gel and transferred to a nylon membrane
as described in Example 8. The membrane was hybridized with
32P-labeled probe prepared from the 4.5 kbp fragment (generated by
amplification of Photorhabdus genomic DNA with primers
TH-5+TH-17). The patterns generated from cosmid DNAs 8A11 and
26D1 were identical to those generated with similarly-cut genomic
DNA on the same membrane. It is concluded that cosmids 8A11 and
26D1 are accurate representations of the genomic TcbAii encoding
locus. However, cosmid 25G8 has a single Bgl II fragment which is
slightly larger than the genomic DNA. This may result from
positioning of the insert within the vector.

DNA sequence of the tcbA-encoding gene

The membrane hybridization analysis of cosmid 26D1 revealed
that the 4.5 kbp probe hybridized to a single large EcoR I
fragment (greater than 9 kbp). This fragment was gel purified
and ligated into the EcoR I site of pBC KS(+) as described in
Example 8, to generate plasmid pBC-S1/R1. The partial DNA
sequence of the insert DNA of this plasmid was determined by
"primer walking" from the flanking vector sequence, using
procedures described in Example 8. Further sequence was
generated by extension from new oligonucleotides designed from
the previously determined sequence. When compared to the
determined DNA sequence for the tcbA gene identified by other
methods (disclosed herein as SEQ ID NO:11 as described in Example
12 below), complete homology was found to nucleotides 1-272,
313-826, 2578-3036, and 3068-3540 (total bases = 1712). It was
concluded that both approaches can be used to identify DNA
fragments encoding the TcbAii peptide.



Analysis of the derived amino acid sequence of the tcbA gene.

The sequence of the DNA fragment identified as SEQ ID NO:11
encodes a protein whose derived amino acid sequence is disclosed herein
as SEQ ID NO:12. Several features verify the identity of the gene as
that encoding the TcbAii protein. The TcbAii N-terminal peptide (SEQ
ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala) is
encoded as amino acids 88-100. The TcbAii internal peptide
TcbAii-PT81(a) (SEQ ID NO:23) is encoded as amino acids 1065-1077,
and TcbAii-PT81(b) (SEQ ID NO:24) is encoded as amino acids
1571-1592. Further, the internal peptide TcbAii-PT56 (SEQ ID NO:22)
is encoded as amino acids 1474-1488, and the internal peptide
TcbAii-PT103 (SEQ ID NO:24) is encoded as amino acids 1614-1639.
It is obvious that this gene is an authentic clone encoding the
TcbAii peptide as isolated from insecticidal protein preparations
of Photorhabdus luminescens strain W-14.

The protein isolated as peptide TcbAii is derived from cleavage
of a longer peptide. Evidence for this is provided by the fact that
the nucleotides encoding the TcbAii N-terminal peptide SEQ ID NO:1 are
preceded by 261 bases (encoding 87 N-terminal-proximal amino acids) of
a longer open reading frame (SEQ ID NO:11). This reading frame begins
with nucleotides that encode the amino acid sequence Met Gln Asn Ser
Leu, which corresponds to the N-terminal sequence of the large peptide
TcbA, and is disclosed herein as SEQ ID NO:16. It is thought that TcbA
is the precursor protein for TcbAii.

Relationship of tcbA, tcaB and tcaC genes.

The tcaB and tcaC genes are closely linked and may be
transcribed as a single mRNA (Example 8). The tcbA gene is borne
on cosmids that apparently do not overlap the ones harboring the
tcaB and tcaC cluster, since the respective genomic library
screens identified different cosmids. However, comparison of the
amino acid sequences encoded by the tcaB and tcaC genes with the
tcbA gene reveals a substantial degree of homology. The amino acid
conservation (Protein Alignment Mode of MacVector™ Sequence
Analysis Software, scoring matrix pam250, hash value = 2; Kodak
Scientific Imaging Systems, Rochester, NY) is shown in Fig. 4.
On the score line of each panel in Fig. 4, up carats (^) indicate
homology or conservative amino acid changes, and down carats (v)
indicate nonhomology.

This analysis shows that the amino acid sequence of the TcbA
peptide from residues 1739 to 1894 is highly homologous to amino
acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino
acids of P8; SEQ ID NO:28). In addition, the sequence of TcbA
amino acids 1932 to 2459 is highly homologous to amino acids 12
to 531 of peptide TcaBii (520 of the total 562 amino acids; SEQ
ID NO:30). Considering that the TcbA peptide (SEQ ID NO:12)
comprises 2505 amino acids, a total of 684 amino acids (27%) at
the C-proximal end of it is homologous to the TcaBi or TcaBii
peptides, and the homologies are arranged colinear to the
arrangement of the putative TcaB preprotein (SEQ ID NO:26). A
sizeable gap in the TcbA homology coincides with the junction
between the TcaBi and TcaBii portions of the TcaB preprotein.
Clearly the TcbA and TcaB gene products are evolutionarily
related, and it is proposed that they share some common
function(s) in Photorhabdus.
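
The 684-residue and 27% figures quoted above can be recovered from the two TcbA residue ranges. The sketch below is illustrative arithmetic only, using inclusive residue ranges as stated in the text; the names are hypothetical.

```python
def span(first: int, last: int) -> int:
    """Number of residues in an inclusive range."""
    return last - first + 1


TCBA_LENGTH = 2505
# TcbA residue ranges reported as homologous to TcaBi and TcaBii.
tcba_blocks = [(1739, 1894), (1932, 2459)]

homologous = sum(span(first, last) for first, last in tcba_blocks)
percent_of_tcba = 100.0 * homologous / TCBA_LENGTH

# Residues lying between the two homologous blocks.
gap = tcba_blocks[1][0] - tcba_blocks[0][1] - 1

print(homologous, round(percent_of_tcba), gap)
```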

Example 10

Characterization of zinc-metalloproteases in Photorhabdus Broth:
Protease Inhibition, Classification, and Purification

Protease assays were performed using FITC-casein dissolved in
water as substrate (0.08% final assay concentration). Proteolysis
reactions were performed at 25°C for 1 h in the appropriate
buffer with 25 µl of Photorhabdus broth (150 µl total reaction
volume). Samples were also assayed in the presence and absence
of dithiothreitol. After incubation, an equal volume of 12%
trichloroacetic acid was added to precipitate undigested protein.
Following precipitation for 0.5 h and subsequent centrifugation,
100 µl of the supernatant was placed into a 96-well microtiter
plate and the pH of the solution was adjusted by addition of an
equal volume of 4N NaOH. Proteolysis was then quantitated using
a Fluoroskan II fluorometric plate reader at excitation and
emission wavelengths of 485 and 538 nm, respectively. Protease
activity was tested over a range from pH 5.0-10.0 in 0.5 unit
increments. The following buffers were used at 50 mM final
concentration: sodium acetate (pH 5.0-6.5); Tris-HCl (pH
7.0-8.0); and bis-Tris propane (pH 8.5-10.0). To identify the class
of protease(s) observed, crude broth was treated with a variety
of protease inhibitors (0.5 µg/µl final concentration) and then
examined for protease activity at pH 8.0 using the substrate
described above. The protease inhibitors used included E-64
(L-trans-epoxysuccinylleucylamido[4-guanidino]-butane),
3,4-dichloroisocoumarin, Leupeptin, pepstatin, amastatin,
ethylenediaminetetraacetic acid (EDTA) and 1,10-phenanthroline.

Protease assays performed over a pH range revealed that
indeed protease(s) were present which exhibited maximal activity
at ~pH 8.0 (Table 16). Addition of DTT did not have any effect
on protease activity. Crude broth was then treated with a
variety of protease inhibitors (Table 17). Treatment of crude
broth with the inhibitors described above revealed that
1,10-phenanthroline caused complete inhibition of all protease
activity when added at a final concentration of 50 µg, with the
IC50 = 5 µg in 100 µl of a 2 mg/ml crude broth solution. These
data indicate that the most abundant protease(s) found in the
Photorhabdus broth are from the zinc-metalloprotease class of
enzymes.



Table 16

Effect of pH on the protease activity found in a Day 1 production
of Photorhabdus luminescens (strain W-14).

pH       Flu. Units(a)      Percent Activity(b)

 5.0      3013 ± 78          17
 5.5     [illegible]         45
 6.0     12965 ± 483         74
 6.5     14390 ± 1291        82
 7.0     14386 ± 1287        82
 7.5     14135 ± 198         80
 8.0     17582 ± 831        100
 8.5     16183 ± 953         92
 9.0     16795 ± 760         96
 9.5     16279 ± 1022        93
10.0     15225 ± 210         87

a. Flu. Units = Fluorescence Units (Maximum = ~28,000;
background = ~2200).

b. Percent activity relative to the maximum at pH 8.0.
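
The percent activity column of Table 16 can be recomputed from the fluorescence readings. The sketch below is an illustration, not the authors' procedure: it uses the mean readings that are legible in the table (the pH 5.5 reading is omitted), divides each by the largest reading, and rounds to the nearest whole percent.

```python
# Mean fluorescence units by pH, as legible in Table 16.
readings = {
    5.0: 3013, 6.0: 12965, 6.5: 14390, 7.0: 14386, 7.5: 14135,
    8.0: 17582, 8.5: 16183, 9.0: 16795, 9.5: 16279, 10.0: 15225,
}

peak = max(readings.values())
optimum_ph = max(readings, key=readings.get)

# Activity at each pH as a whole-number percentage of the peak reading.
percent_activity = {ph: round(100 * flu / peak) for ph, flu in readings.items()}

print(optimum_ph, percent_activity)
```

Computed this way, without subtracting the background, the values agree with the tabulated percentages.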

Table 17

Effect of different protease inhibitors on the protease activity
at pH 8 found in a Day 1 production of Photorhabdus luminescens
(strain W-14).

Inhibitor                   Corrected Flu. Units(a)   Percent Inhibition(b)

Control                     13053                       0
E-64                        14259                       0
1,10 Phenanthroline^           15                      99
3,4 Dichloroisocoumarin^     7956                      39
Leupeptin                   13074                       0
Pepstatin^                  13441                       0
Amastatin                   12474                       4
DMSO Control                12005                       8
Methanol Control            12125                       7

a. Corrected Flu. Units = Fluorescence Units - background
(2200 flu. units).

b. Percent Inhibition relative to protease activity at pH 8.0.

c. Inhibitors were dissolved in methanol.

d. Inhibitors were dissolved in DMSO.
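
The percent inhibition column of Table 17 can be recomputed from the corrected fluorescence values. The sketch below rests on two assumptions that the source does not state: readings above the control are reported as 0% inhibition, and fractional percentages are truncated rather than rounded.

```python
CONTROL = 13053  # corrected fluorescence units, uninhibited control

corrected_units = {
    "E-64": 14259,
    "1,10-phenanthroline": 15,
    "3,4-dichloroisocoumarin": 7956,
    "leupeptin": 13074,
    "pepstatin": 13441,
    "amastatin": 12474,
    "DMSO control": 12005,
    "methanol control": 12125,
}


def percent_inhibition(value: float, control: float) -> int:
    """Whole-number percent inhibition, floored at zero and truncated."""
    return int(max(0.0, 100.0 * (1.0 - value / control)))


inhibition = {name: percent_inhibition(units, CONTROL)
              for name, units in corrected_units.items()}
print(inhibition)
```

Under these assumptions the computed values agree with the tabulated column; only 1,10-phenanthroline approaches complete inhibition.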

The isolation of a zinc-metalloprotease was performed by
applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose
column equilibrated at 50 mM Na2PO4, pH 7.0 as described in
Example 5 for Photorhabdus toxin. After extensive washing, a 0
to 0.5 M NaCl gradient was used to elute toxin protein. The
majority of biological activity and protein was eluted from 0.15
- 0.45 M NaCl. However, it was observed that the majority of
proteolytic activity was present in the 0.25-0.35 M NaCl fraction
with some activity in the 0.15-0.25 M NaCl fraction. SDS PAGE
analysis of the 0.25-0.35 M NaCl fraction showed a major peptide
band of approximately 60 kDa. The 0.15-0.25 M NaCl fraction
contained a similar 60 kDa band but at lower relative protein
concentration. Subsequent gel filtration of this fraction using
a Superose 12 HR 16/50 column resulted in a major peak migrating
at 57.5 kDa that contained a predominant (> 90% of total stained
protein) 58.5 kDa band by SDS PAGE analysis. Additional analysis
of this fraction using various protease inhibitors as described
above determined that the protease was a zinc-metalloprotease.
Nearly all of the protease activity present in Photorhabdus broth
at day 1 of fermentation corresponded to the ~58 kDa
zinc-metalloprotease.

In yet a second isolation of zinc-metalloprotease(s), W-14
Photorhabdus broth grown for three days was taken and protease
activity was visualized using sodium dodecyl sulfate-
polyacrylamide gel electrophoresis (SDS-PAGE) laced with gelatin
as described in Schmidt, T.M., Bleakley, B. and Nealson, K.M.
1988. SDS running gels (5.5 x 3 cm) were made with 12.5%
polyacrylamide (40% stock solution of acrylamide/bis-acrylamide;
Sigma Chemical Co., St. Louis, MO) into which 0.1% gelatin final
concentration (Biorad EIA grade reagent; Richmond, CA) was
incorporated upon dissolving in water. SDS-stacking gels (1.0 x
3 cm) were made with 5% polyacrylamide, also laced with 0.1%
gelatin. Typically, 2.5 µg of protein to be tested was diluted
in 0.03 ml of SDS-PAGE loading buffer without dithiothreitol
(DTT) and loaded onto the gel. Proteins were electrophoresed in
SDS running buffer (Laemmli, U.K. 1970. Nature 227, 680) at 0°C
and at 8 mA. After electrophoresis was complete, the gel was
washed for 2 h in 2.5% (v/v) Triton X-100. Gels were then
incubated for 1 h at 37°C in 0.1 M glycine (pH 8.0). After
incubation, gels were fixed and stained overnight with 0.1% amido
black in methanol-acetic acid-water (30:10:60, vol./vol./vol.;
Sigma Chemical Co.). Protease activity was visualized as light
areas against a dark, amido black stained background due to
proteolysis and subsequent diffusion of incorporated gelatin. At
least three distinct bands produced by proteolytic activity at
58, 41, and 38 kDa were observed.

Activity assays of the different proteases in W-14 day three
culture broth were performed using FITC-casein dissolved in water
as substrate (0.02% final assay concentration). Proteolysis
experiments were performed at 37°C for 0-0.5 h in 0.1 M Tris-HCl
(pH 8.0) with different protein fractions in a total volume of
0.15 ml. Reactions were terminated by addition of an equal
volume of 12% trichloroacetic acid (TCA) dissolved in water.
After incubation at room temperature for 0.25 h, samples were
centrifuged at 10,000 x g for 0.25 h and 0.10 ml aliquots were
removed and placed into 96-well microtiter plates. The solution
was then neutralized by the addition of an equal volume of 2 N
sodium hydroxide, followed by quantitation using a Fluoroskan II
fluorometric plate reader with excitation and emission
wavelengths of 485 and 538 nm, respectively. Activity
measurements were performed using FITC-casein with different
protease concentrations at 37°C for 0-10 min. A unit of
activity was arbitrarily defined as the amount of enzyme needed
to produce 1000 fluorescent units/min and specific activity was
defined as units/mg of protease.
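The unit definition above can be written out as a short calculation. This is a minimal sketch: the function names and the example reading are hypothetical and only illustrate the arithmetic of "1 unit = 1000 fluorescent units/min" and "specific activity = units/mg".

```python
def protease_units(delta_fluorescence, minutes):
    """Units of activity, where 1 unit yields 1000 fluorescent units per minute."""
    return (delta_fluorescence / minutes) / 1000.0


def specific_activity(units, protease_mg):
    """Specific activity expressed as units per mg of protease."""
    return units / protease_mg


# Hypothetical reading: 5000 fluorescent units gained in 10 min by 0.002 mg protease.
units = protease_units(5000, 10)
print(units, specific_activity(units, 0.002))
```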

Inhibition studies were performed using two zinc-
metalloprotease inhibitors, 1,10 phenanthroline and N-(α-
rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon),
with stock solutions of the inhibitors dissolved in 100% ethanol
and water, respectively. Stock concentrations were typically 10
mg/ml and 5 mg/ml for 1,10 phenanthroline and phosphoramidon,
respectively, with final concentrations of inhibitor at 0.5-1.0
mg/ml per reaction. Treatment of three day W-14 crude broth with
1,10 phenanthroline, an inhibitor of all zinc metalloproteases,
resulted in complete elimination of all protease activity while
treatment with phosphoramidon, an inhibitor of thermolysin-like
proteases (Weaver, L.H., Kester, W.R., and Matthews, B.W. 1977.
J. Mol. Biol. 114, 119-132), resulted in ~56% reduction of
protease activity. The residual proteolytic activity could not
be further reduced with additional phosphoramidon.
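The stock and final inhibitor concentrations above are related by a simple dilution. The sketch below assumes the 0.15 ml reaction volume used in the activity assays; the function name is illustrative.

```python
def stock_volume_ml(stock_mg_per_ml, final_mg_per_ml, reaction_ml):
    """Volume of stock needed so the reaction reaches the final concentration."""
    if final_mg_per_ml > stock_mg_per_ml:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_mg_per_ml * reaction_ml / stock_mg_per_ml


# 1,10 phenanthroline: 10 mg/ml stock, 1.0 mg/ml final, assumed 0.15 ml reaction.
print(stock_volume_ml(10.0, 1.0, 0.15))
# Phosphoramidon: 5 mg/ml stock, 0.5 mg/ml final, assumed 0.15 ml reaction.
print(stock_volume_ml(5.0, 0.5, 0.15))
```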

The proteases of three day W-14 Photorhabdus broth were
purified as follows: 4.0 liters of broth were concentrated using
an Amicon spiral ultrafiltration cartridge Type S1Y100 attached
to an Amicon M-12 filtration device. The flow-through material
having native proteins less than 100 kDa in size (3.8 L) was
concentrated to 0.375 L using an Amicon spiral ultrafiltration
cartridge Type S1Y10 attached to an Amicon M-12 filtration
device. The retentate material contained proteins ranging in
size from 10-100 kDa. This material was loaded onto a Pharmacia
HR16/10 column which had been packed with PerSeptive Biosystems
(Framingham, MA) Poros® 50 HQ strong anion exchange packing that
had been equilibrated in 10 mM sodium phosphate buffer (pH 7.0).
Proteins were loaded on the column at a flow rate of 5 ml/min,
followed by washing unbound protein with buffer until A280 =
0.00. Afterwards, proteins were eluted using a NaCl gradient of
0-1.0 M NaCl in 40 min at a flow rate of 7.5 ml/min. Fractions
were assayed for protease activity, supra, and active fractions
were pooled. Proteolytically active fractions were diluted with
50% (v/v) 10 mM sodium phosphate buffer (pH 7.0) and loaded onto
a Pharmacia HR 10/10 Mono Q column equilibrated in 10 mM sodium
phosphate. After washing the column with buffer until A280 =
0.00, proteins were eluted using a NaCl gradient of 0-0.5 M NaCl
for 1 h at a flow rate of 2.0 ml/min. Fractions were assayed for
protease activity. Those fractions having the greatest amount of
phosphoramidon-sensitive protease activity, the phosphoramidon-
sensitive activity being due to the 41/38 kDa protease, infra,
were pooled. These fractions were found to elute at a range of
0.15-0.25 M NaCl. Fractions containing a predominance of
phosphoramidon-insensitive protease activity, the 58 kDa
protease, were also pooled. These fractions were found to elute
at a range of 0.25-0.35 M NaCl. The phosphoramidon-sensitive
protease fractions were then concentrated to a final volume of
0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device
Biomax-5K NMWL membrane. This material was applied at a flow
rate of 0.5 ml/min to a Pharmacia HR 10/30 column that had been
packed with Pharmacia Sephadex G-50 equilibrated in 10 mM sodium
phosphate buffer (pH 7.0)/0.1 M NaCl. Fractions having the
maximal phosphoramidon-sensitive protease activity were then
pooled and centrifuged over a Millipore Ultrafree®-15 centrifugal
filter device Biomax-50K NMWL membrane. Proteolytic activity
analysis, supra, indicated this material to have only
phosphoramidon-sensitive protease activity. Pooling of the
phosphoramidon-insensitive protease, the 58 kDa protein, was
followed by concentrating in a Millipore Ultrafree®-15
centrifugal filter device Biomax-50K NMWL membrane and further
separation on a Pharmacia Superdex-75 column. Fractions
containing the protease were pooled.

Analysis of the purified 58- and 41/38 kDa proteases
revealed that, while both types of protease were completely
inhibited with 1,10 phenanthroline, only the 41/38 kDa protease
was inhibited with phosphoramidon. Further analysis of crude
broth indicated that protease activity of day 1 W-14 broth has
23% of the total protease activity due to the 41/38 kDa protease,
increasing to 44% in day three W-14 broth.

Standard SDS-PAGE analysis for examining protein purity and
obtaining amino terminal sequence was performed using 4-20%
gradient MiniPlus SepraGels purchased from Integrated Separation
Systems (Natick, MA). Proteins to be amino-terminal sequenced
were blotted onto PVDF membrane following purification, infra,
(ProBlott™ Membranes; Applied Biosystems, Foster City, CA),
visualized with 0.1% amido black, excised, and sent to Cambridge
Prochem, Cambridge, MA, for sequencing.

Deduced amino terminal sequences of the 58- (SEQ ID NO:45)
and 41/38 kDa (SEQ ID NO:44) proteases from three day old W-14
broth were DV-GSEKANEKLK (SEQ ID NO:45) and DSGDDDKVTOTDIHR (SEQ
ID NO:44), respectively.

Sequencing of the 41/38 kDa protease revealed several amino
termini, each one having an additional amino acid removed by
proteolysis. Examination of the primary, secondary, tertiary and
quaternary sequences for the 38 and 41 kDa polypeptides allowed
for deduction of the sequence shown above and revealed that these
two proteases are homologous.

Example 11, Part A
Screening of Photorhabdus Genomic Library via use of Antibodies
for Genes encoding TcbA Peptide

In parallel to the sequencing described above, suitable
probing and sequencing was done based on the TcbAii peptide (SEQ
ID NO:1). This sequencing was performed by preparing bacterial
culture broths and purifying the toxin as described in Examples 1
and 2 above.

Genomic DNA was isolated from the Photorhabdus luminescens
strain W-14 grown in Grace's insect tissue culture medium. The
bacteria were grown in 5 ml of culture medium in a 250 ml
Erlenmeyer flask at 28°C and 250 rpm for approximately 24 hours.
Bacterial cells from 100 ml of culture medium were pelleted at
5000 x g for 10 minutes. The supernatant was discarded, and the
cell pellets then were used for the genomic DNA isolation.

The genomic DNA was isolated using a modification of the
CTAB method described in Section 2.4.3 of Ausubel (supra.). The
section entitled "Large Scale CsCl prep of bacterial genomic DNA"
was followed through step 6. At this point, an additional
chloroform/isoamyl alcohol (24:1) extraction was performed
followed by a phenol/chloroform/isoamyl (25:24:1) extraction step
and a final chloroform/isoamyl alcohol (24:1) extraction. The
DNA was precipitated by the addition of a 0.6 volume of
isopropanol. The precipitated DNA was hooked and wound around
the end of a bent glass rod, dipped briefly into 70% ethanol as a
final wash, and dissolved in 3 ml of TE buffer.
The DNA concentration, estimated by optical density at
280/260 nm, was approximately 2 mg/ml.
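Estimating DNA concentration from optical density is usually done with a fixed conversion factor. The sketch below assumes the common convention that one A260 unit corresponds to about 50 µg/ml of double-stranded DNA in a 1 cm path; the absorbance readings and dilution in the example are hypothetical and are not taken from the source.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # assumed conversion factor for double-stranded DNA


def dna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of the undiluted sample from the A260 of a diluted aliquot."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio, commonly used as a rough indicator of protein carry-over."""
    return a260 / a280


# Hypothetical reading: A260 = 0.40 and A280 = 0.21 on a 1:100 dilution.
print(dna_concentration_ug_per_ml(0.40, 100))
print(purity_ratio(0.40, 0.21))
```

With these hypothetical readings the estimate is about 2000 µg/ml, the same order as the approximately 2 mg/ml reported above.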

Using this genomic DNA, a library was prepared.
Approximately 50 µg of genomic DNA was partly digested with
Sau3A1. Then NaCl density gradient centrifugation was used to size
fractionate the partially digested DNA fragments. Fractions
containing DNA fragments with an average size of 12 kb, or
larger, as determined by agarose gel electrophoresis, were
ligated into the plasmid Bluescript (Stratagene, La Jolla,
California) and transformed into an E. coli DH5α or DHB10 strain.

Separately, purified aliquots of the protein were sent to
the biotechnology hybridoma center at the University of
Wisconsin, Madison for production of monoclonal antibodies to the
proteins. The material that was sent was the HPLC purified
fraction containing native bands 1 and 2 which had been denatured
at 65°C, and 20 µg of which was injected into each of four mice.
Stable monoclonal antibody-producing hybridoma cell lines were
recovered after spleen cells from an immunized mouse were fused
with a stable myeloma cell line. Monoclonal antibodies were
recovered from the hybridomas.

Separately, polyclonal antibodies were created by taking
native agarose gel purified band 1 (see Example 1) protein which
was then used to immunize a New Zealand white rabbit. The
protein was prepared by excising the band from the native agarose
gels, briefly heating the gel pieces to 65°C to melt the agarose,
and immediately emulsifying with adjuvant. Freund's complete
adjuvant was used for the primary immunizations and Freund's
incomplete was used for 3 additional injections at monthly
intervals. For each injection, approximately 0.2 ml of
emulsified band 1, containing 50 to 100 micrograms of protein,
was delivered by multiple subcutaneous injections into the back
of the rabbit. Serum was obtained 10 days after the final
injection and additional bleeds were performed at weekly
intervals for 3 weeks. The serum complement was inactivated by
heating to 56°C for 15 minutes and then stored at -20°C.

The monoclonal and polyclonal antibodies were then used to
screen the genomic library for the expression of antigens which
could be detected by the epitope. Positive clones were detected
on nitrocellulose filter colony lifts. An immunoblot analysis of
the positive clones was undertaken.
An analysis of the clones as defined by both immunoblot and
Southern analysis resulted in the tentative identification of
five classes of clones.

In the first class of clone was a gene encoding the peptide
designated here as TcbAii. Full DNA sequence of this gene (tcbA)
was obtained. It is set forth as SEQ ID NO:11. Confirmation
that the sequence encodes the internal sequence of SEQ ID NO:1 is
demonstrated by the presence of SEQ ID NO:1 at amino acid number
88 from the deduced amino acid sequence created by the open
reading frame of SEQ ID NO:11. This can be confirmed by
referring to SEQ ID NO:12, which is the deduced amino acid
sequence created by SEQ ID NO:11.

The second class of toxin peptides contains the segments
referred to above as TcaBi, TcaBii and TcaC. Following the
screening of the library with the polyclonal antisera, this
second class of toxin genes was identified by several clones
which produced different size proteins, all of which cross-
reacted with the polyclonal antibody on an immunoblot and were
also found to share DNA homology on a Southern blot. Sequence
comparison revealed that they belonged to the gene complex
designated TcaB and TcaC above.

Three other classes of antibody toxin clones were also
isolated in the polyclonal screen. These classes produced
proteins that cross-react with a polyclonal antibody and also
shared DNA homology with the classes as determined by Southern
blotting. The classes have been designated Class III, Class IV
and Class V. It was also possible to identify monoclonals that
cross-reacted with Class I, II, III, and IV. This suggests that
all have regions of high protein homology. Thus, it appears that
the P. luminescens extracellular protein genes represent a family
of genes which are evolutionarily related.

To further pursue the concept that there might be
evolutionarily related variations in the toxin peptides contained
within this organism, two approaches have been undertaken to
examine other strains of P. luminescens for the presence of
related proteins. This was done both by PCR amplification of
genomic DNA and by immunoblot analysis using the polyclonal and
monoclonal antibodies.
The results indicate that related proteins are produced by
P. luminescens strains WX-2, WX-3, WX-4, WX-5, WX-[illegible],
WX-[illegible], WX-[illegible], WX-11, WX-12, WX-15 and W-14.

Example 11, Part B
Sequence and analysis of Class III toxin clones - tcc

Further DNA sequencing was performed on plasmids isolated
from Class III E. coli clones described in Example 11, Part A.
The nucleotide sequence was shown to be three closely linked open
reading frames at this genomic locus. This locus was designated
tcc with the three open reading frames designated tccA SEQ ID
NO:56, tccB SEQ ID NO:58 and tccC SEQ ID NO:60 (Fig. 6B).

The deduced amino acid sequence from the tccA open reading
frame indicates the gene encodes a protein of 105,459 Da. This
protein was designated TccA. The first 12 amino acids of this
protein match the N-terminal sequence obtained from a 108 kDa
protein, SEQ ID NO:7, previously identified as part of the toxin
complex.
The deduced amino acid sequence from the tccB open reading
frame indicates this gene encodes a protein of 175,716 Da. This
protein was designated TccB. The first 11 amino acids of this
protein match the N-terminal sequence obtained from a protein
with estimated molecular weight of 185 kDa, SEQ ID NO:8.

The deduced amino acid sequence of tccC indicated that this
open reading frame encodes a protein of 111,694 Da and the
protein product was designated TccC.
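A predicted protein mass such as those quoted above is obtained by summing residue masses over the deduced amino acid sequence and adding one water for the free termini. The sketch below is a generic illustration using standard average residue masses; it is not the software used in the source, and the test peptide is an arbitrary example rather than one of the SEQ ID sequences.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153


def protein_mass_da(sequence):
    """Average mass of an unmodified polypeptide given its one-letter sequence."""
    try:
        return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER
    except KeyError as err:
        raise ValueError(f"unknown residue code: {err.args[0]}") from None


# Arbitrary example peptide (not from the source).
print(round(protein_mass_da("GASP"), 1))
```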



Example 12 

Characterization of Photorhabdus Strains

In order to establish that the collection described herein
was comprised of Photorhabdus strains, the strains herein were
assessed in terms of recognized microbiological traits that are
characteristic of Photorhabdus and which differentiate it from
other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J.J. 1984.
Bergey's Manual of Systematic Bacteriology, vol. 1, pp. 510-511
(ed. Krieg, N.R. and Holt, J.G.). Williams & Wilkins, Baltimore;
Akhurst and Boemare, 1988; Boemare et al., 1993). These
characteristic traits are as follows: Gram's stain negative
rods, organism size of 0.5-2 µm in width and 2-10 µm in length,
red/yellow colony pigmentation, presence of crystalline inclusion
bodies, presence of catalase, inability to reduce nitrate,
presence of bioluminescence, ability to take up dye from growth
media, positive for protease production, growth-temperature range
below 37°C, survival under anaerobic conditions and positively
motile (Table 18). Reference Escherichia coli, Xenorhabdus and
Photorhabdus strains were included in all tests for comparison.
The overall results are consistent with all strains being part of
the family Enterobacteriaceae and the genus Photorhabdus.

A luminometer was used to establish the bioluminescence of
each strain and provide a quantitative and relative measurement
of light production. For measurement of relative light emitting
units, the broths from each strain (cells and media) were
measured at three time intervals after inoculation in liquid
culture (6, 12, and 24 hr) and compared to background luminosity
(uninoculated media and water). Prior to measuring light
emission from the various broths, cell density was established by
measuring light absorbance (560 nm) in a Gilford Systems
(Oberlin, OH) spectrophotometer using a sipper cell. Appropriate
dilutions were then made (to normalize optical density to 1.0
unit) before measuring luminosity. Aliquots of the diluted
broths were then placed into cuvettes (300 µl each) and read in a
Bio-Orbit 1251 Luminometer (Bio-Orbit Oy, Turku, Finland). The
integration period for each sample was 45 seconds. The samples
were continuously mixed (spun in baffled cuvettes) while being
read to provide oxygen availability. A positive test was
determined as being ≥ 5-fold background luminescence (~5-10
units). In addition, colony luminosity was detected with
photographic film overlays and visually, after adaptation in a
darkroom.

The Gram's staining characteristics of each strain
were established with a commercial Gram's stain kit (BBL,
Cockeysville, MD) used in conjunction with Gram's stain control
slides (Fisher Scientific, Pittsburgh, PA). Microscopic
evaluation was then performed using a Zeiss microscope (Carl
Zeiss, Germany) 100X oil immersion objective lens (with 10X
ocular and 2X body magnification). Microscopic examination of
individual strains for organism size, cellular description and
inclusion bodies (the latter after logarithmic growth) was
performed using wet mount slides (10X ocular, 2X body and 40X
objective magnification) with oil immersion and phase contrast
microscopy with a micrometer (Akhurst, R.J. and Boemare, N.E.
1990. Entomopathogenic Nematodes in Biological Control (ed.
Gaugler, R. and Kaya, H.), pp. 75-90. CRC Press, Boca Raton,
USA; Baghdiguian S., Boyer-Giglio M.H., Thaler, J.O., Bonnot G.,
Boemare N. 1993. Biol. Cell 79, 177-185).

Colony pigmentation
was observed after inoculation on Bacto nutrient agar (Difco
Laboratories, Detroit, MI) prepared as per label instructions.
Incubation occurred at 28°C and descriptions were produced after
5-7 days. To test for the presence of the enzyme catalase, a
colony of the test organism was removed on a small plug from a
nutrient agar plate and placed into the bottom of a glass test
tube. One ml of a household hydrogen peroxide solution was gently
added down the side of the tube. A positive reaction was
recorded when bubbles of gas (presumptive oxygen) appeared
immediately or within 5 seconds. Controls of uninoculated
nutrient agar and hydrogen peroxide solution were also examined.
To test for nitrate reduction, each culture was inoculated into
10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI).
After 24 hours incubation at 28°C, nitrite production was tested
by the addition of two drops of sulfanilic acid reagent and two
drops of alpha-naphthylamine reagent (see Difco Manual, 10th
edition, Difco Laboratories, Detroit, MI, 1984). The generation
of a distinct pink or red color indicates the formation of
nitrite from nitrate.

The ability of each strain to take up dye
from growth media was tested with Bacto MacConkey agar containing
the dye neutral red, Bacto Tergitol-7 agar containing the dye
bromothymol blue and Bacto EMB Agar containing the dye eosin-Y
(agars from Difco Laboratories, Detroit, MI, all prepared
according to label instructions). After inoculation on these
media, dye uptake was recorded after incubation at 28°C for 5
days. Growth on these latter media is characteristic for members
of the family Enterobacteriaceae.

Motility of each strain was
tested using a solution of Bacto Motility Test Medium (Difco
Laboratories, Detroit, MI) prepared as per label instructions. A
butt-stab inoculation was performed with each strain and motility
was judged macroscopically by a diffuse zone of growth spreading
from the line of inoculum. In many cases, motility was also
observed microscopically from liquid culture under wet mount
slides.

Biochemical nutrient evaluation for each strain was
performed using BBL Enterotube II (Becton Dickinson, Germany).
Product instructions were followed with the exception that
incubation was carried out at 28°C for 5 days. Results were
consistent with previously cited reports for Photorhabdus. The
production of protease was tested by observing hydrolysis of
gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI)
plates made as per label instructions. Cultures were inoculated
and the plates were incubated at 28°C for 5 days.

To assess
growth at different temperatures, agar plates [2% proteose
peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in
deionized water] were streaked from a common source of inoculum.
Plates were sealed with Nesco® film and incubated at 20, 28 and
37°C for up to three weeks. Plates showing no growth at 37°C
showed no cell viability after transfer to a 28°C incubator for
one week.

Oxygen requirements for Photorhabdus strains were
tested in the following manner. A butt-stab inoculation into
fluid thioglycolate broth medium (Difco, Detroit, MI) was made.
The tubes were incubated at room temperature for one week and
cultures were then examined for type and extent of growth. The
indicator resazurin demonstrates the level of medium oxidation or
the aerobiosis zone (Difco Manual, 10th edition, Difco
Laboratories, Detroit, MI). Growth zone results obtained for the
Photorhabdus strains tested were consistent with those of a
facultative anaerobic microorganism.
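The luminometer procedure described above reduces to two small calculations: diluting each broth to an optical density of 1.0 unit, and scoring a strain as positive when its reading is at least 5-fold the background. The sketch below illustrates both; the function names and the example readings are hypothetical.

```python
def dilution_factor(od560, target_od=1.0):
    """Fold dilution needed to bring a broth down to the target optical density."""
    return max(1.0, od560 / target_od)


def is_luminescent(sample_units, background_units, fold=5.0):
    """Positive test: reading is at least `fold` times the background luminescence."""
    return sample_units >= fold * background_units


# Hypothetical broth with OD560 = 2.5 read against a background of 10 units.
print(dilution_factor(2.5))
print(is_luminescent(60, 10), is_luminescent(40, 10))
```

A broth already at or below the target density is left undiluted in this sketch, which is an assumption rather than a stated step in the source.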
Table 18

Taxonomic Traits of Photorhabdus strains

Traits Assessed*

[Table body not recoverable from the source.]

* Legible trait keys: G = Presence of catalase, H = Nitrate
reduction, I = Dye uptake, J = Pigmentation, M = Growth on
Tergitol-7 agar, N = Facultative anaerobe, O = Growth at 20°C,
Q = Growth at 37°C; growth on MacConkey agar is also listed (key
letter illegible). +, - = positive or negative.
Cellular fatty acid analysis is a recognized tool for
bacterial characterization at the genus and species level
(Tornabene, T.G. 1985. Lipid Analysis and the Relationship to
Chemotaxonomy in Methods in Microbiology, Vol. 18, 209-234;
Goodfellow, M. and O'Donnell, A.G. 1993. Roots of Bacterial
Systematics in Handbook of New Bacterial Systematics (ed.
Goodfellow, M. & O'Donnell, A.G.) pp. 3-54. London: Academic
Press Ltd.); these references are incorporated herein by
reference, and were used to confirm that our collection was
related at the genus level. Cultures were shipped to an
external, contract laboratory for fatty acid methyl ester
analysis (FAME) using a Microbial ID (MIDI, Newark, DE, USA)
Microbial Identification System (MIS). The MIS system consists of
a Hewlett Packard HP5890A gas chromatograph with a 25 m x 0.2 mm
5% methylphenyl silicone fused silica capillary column. Hydrogen
is used as the carrier gas and a flame-ionization detector
functions in conjunction with an automatic sampler, integrator
and computer. The computer compares the sample fatty acid methyl
esters to a microbial fatty acid library and against a
calibration mix of known fatty acids. As selected by the
contract laboratory, strains were grown for 24 hours at 28°C on
trypticase soy agar prior to analysis. Extraction of samples was
performed by the contract lab as per standard FAME methodology.
There was no direct identification of the strains to any
luminescent bacterial group other than Photorhabdus. When the
cluster analysis was performed, which compares the fatty acid
profiles of a group of isolates, the strain fatty acid profiles
were related at the genus level.

The evolutionary diversity of the Photorhabdus strains in
our collection was measured by analysis of PCR (Polymerase Chain
Reaction) mediated genomic fingerprinting using genomic DNA from
each strain. This technique is based on families of repetitive
DNA sequences present throughout the genome of diverse bacterial
species (reviewed by Versalovic, J., Schneider, M., de Bruijn,
F.J. and Lupski, J.R. 1994. Methods Mol. Cell. Biol., 5, 25-40).
Three of these, repetitive extragenic palindromic sequence (REP),
enterobacterial repetitive intergenic consensus (ERIC) and the
BOX element, are thought to play an important role in the
organization of the bacterial genome. Genomic organization is
believed to be shaped by selection and the differential
dispersion of these elements within the genome of closely related
bacterial strains can be used to discriminate these strains (e.g.
Louws, F.J., Fulbright, D.W., Stephens, C.T. and de Bruijn, F.J.
1994. Appl. Environ. Micro. 60, 2286-2295). Rep-PCR utilizes
oligonucleotide primers complementary to these repetitive
sequences to amplify the variably sized DNA fragments lying
between them. The resulting products are separated by
electrophoresis to establish the DNA "fingerprint" for each
strain.

To isolate genomic DNA from our strains, cell pellets were
resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a
final volume of 10 ml and 12 ml of 5 M NaCl was then added. This
mixture was centrifuged 20 min. at 15,000 x g. The resulting
pellet was resuspended in 5.7 ml of TE and 300 µl of 10% SDS and
60 µl 20 mg/ml proteinase K (Gibco BRL Products, Grand Island,
NY) were added. This mixture was incubated at 37°C for 1 hr,
approximately 10 mg of lysozyme was then added and the mixture
was incubated for an additional 45 min. One milliliter of 5 M NaCl
and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were
then added and the mixture was incubated 10 min. at 65°C, gently
agitated, then incubated and agitated for an additional 20 min.
to aid in clearing of the cellular material. An equal volume of
chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed
gently then centrifuged. Two extractions were then performed with
an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1).
Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with
70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl
pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by
optical density at 260 nm.

To perform rep-PCR analysis of
Photorhabdus genomic DNA the following primers were used: REP1R-I,
5'-IIIICGICGICATCIGGC-3', and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'.
PCR was performed using the following 25 µl reaction: 7.75 µl H2O,
2.5 µl 10X LA buffer (PanVera Corp., Madison, WI), 10 µl dNTP mix
(2.5 mM each), 1 µl of each primer at 50 pM/µl, 1 µl DMSO, 1.5 µl
genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and
0.25 µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR
amplification was performed in a Perkin Elmer DNA Thermal Cycler
(Norwalk, CT) using the following conditions: 95°C/7 min., then 35
cycles of 94°C/1 min., 44°C/1 min., 65°C/8 min., followed by 15
min. at 65°C. After cycling, the 25 µl reaction was added to 5 µl
of 6X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose
in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer
(0.09 M Tris-borate, 0.002 M EDTA) using 8 µl of each reaction.
The gel was run for approximately 16 hours at 45 V. Gels were then
stained in 20 µg/ml ethidium bromide for 1 hour and destained in
TBE buffer for approximately 3 hours. Polaroid® photographs of
the gels were then taken under UV illumination.
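The rep-PCR reaction mix described above can be checked arithmetically. The sketch below assumes a dNTP mix volume of 10 µl, which is the value at which the listed components sum to the stated 25 µl reaction; the dictionary keys and function names are illustrative.

```python
# Component volumes in microlitres for one reaction.
# The dNTP volume of 10.0 is an assumption chosen so the total is 25 µl.
REACTION_UL = {
    "H2O": 7.75,
    "10X LA buffer": 2.5,
    "dNTP mix (2.5 mM each)": 10.0,
    "REP1R-I primer": 1.0,
    "REP2-I primer": 1.0,
    "DMSO": 1.0,
    "genomic DNA": 1.5,
    "TaKaRa EX Taq": 0.25,
}


def total_volume(components):
    """Sum of all component volumes in microlitres."""
    return sum(components.values())


def final_concentration(stock, added_ul, total_ul):
    """Concentration of a component after dilution into the full reaction."""
    return stock * added_ul / total_ul


def master_mix(components, n_reactions, exclude=("genomic DNA",)):
    """Scaled volumes for a shared mix, leaving out per-sample components."""
    return {name: vol * n_reactions
            for name, vol in components.items() if name not in exclude}


print(total_volume(REACTION_UL))
print(final_concentration(2.5, 10.0, total_volume(REACTION_UL)))
```

Under this assumption each dNTP ends up at 1.0 mM in the reaction, and a master mix for several strains is obtained by scaling every component except the template DNA.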

The presence or absence of bands at specific sizes for each
strain was scored from the photographs and entered as a
similarity matrix in the numerical taxonomy software program
NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli
strain HB101 and Xanthomonas oryzae pv. oryzae assayed at the
same time produced PCR "fingerprints" corresponding to published
reports (Versalovic, J., Koeuth, T. and Lupski, J.R. 1991.
Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C.M., Halda-Alija,
L., Louws, F., Skinner, D.Z., George, M.L., Nelson, R.J., de
Bruijn, F.J., Rice, C. and Leach, J.E. 1995. Int. Rice Res.
Notes 20, 23-24; Vera Cruz, C.M., Ardales, E.Y., Skinner, D.Z.,
Talag, J., Nelson, R.J., Louws, F.J., Leung, H., Mew, T.W. and
Leach, J.E. 1996. Phytopathology (in press), respectively). The
data from Photorhabdus strains were then analyzed with a series
of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative
data) to generate a matrix of similarity coefficients (using the
Jaccard coefficient) and SAHN (Sequential, Agglomerative,
Hierarchical and Nested) clustering [using the UPGMA (Unweighted
Pair-Group Method with Arithmetic Averages) method], which groups
related strains and can be expressed as a phenogram (Figure 5).
The COPH (cophenetic values) and MXCOMP (matrix comparison)
programs were used to generate a cophenetic value matrix and
compare the correlation between this and the original matrix upon
which the clustering was based. A resulting normalized Mantel
statistic (r) was generated which is a measure of the goodness of
fit for a cluster analysis (r = 0.8-0.9 represents a very good
fit). In our case r = 0.919. Therefore, our collection
comprises a diverse group of easily distinguishable strains
representative of the Photorhabdus genus.
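The Jaccard coefficient used for the similarity matrix can be illustrated with a minimal sketch. This is not the NTSYS-pc implementation; the function names and the band profiles below are hypothetical and serve only to show how presence/absence scores translate into pairwise similarities.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient between two presence/absence band profiles.

    Bands present in both strains are divided by bands present in at
    least one strain; positions where both lack a band are ignored.
    """
    if len(a) != len(b):
        raise ValueError("band profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Return {(strain_i, strain_j): Jaccard coefficient} for all pairs."""
    return {
        (s, t): jaccard(profiles[s], profiles[t])
        for s, t in combinations(sorted(profiles), 2)
    }


# Hypothetical band scores (1 = band present, 0 = absent); not data
# taken from the text.
profiles = {
    "strain_A": [1, 1, 0, 1, 0, 0],
    "strain_B": [1, 0, 0, 1, 1, 0],
    "strain_C": [0, 0, 1, 0, 1, 1],
}
matrix = similarity_matrix(profiles)
```

A matrix of this kind is the input that a UPGMA clustering step would then group into a phenogram.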
Example 13

Insecticidal Utility of Toxin(s) Produced
by Various Photorhabdus Strains

Initial "seed" cultures of the various Photorhabdus strains
were produced by inoculating 175 ml of 2% Proteose Peptone #3
(PP3) (Difco Laboratories, Detroit, MI) liquid media with a
primary variant subclone in a 500 ml tribaffled flask with a
Delong neck, covered with a Kaput. Inoculum for each seed culture
was derived from oil-overlay agar slant cultures or plate
cultures. After inoculation, these flasks were incubated for 16
hrs at 28°C on a rotary shaker at 150 rpm. These seed cultures
were then used as uniform inoculum sources for a given
fermentation of each strain. Additionally, overlaying the
post-log seed culture with sterile mineral oil, adding a sterile
magnetic stir bar for future resuspension and storing the culture
in the dark at room temperature provided long-term preservation
of inoculum in a toxin-competent state. The production broths
were inoculated by adding 1% of the actively growing seed culture
to fresh 2% PP3 media (e.g. 1.75 ml per 175 ml fresh media).
Production of broths occurred in either 500 ml tribaffled flasks
(see above), or 2800 ml baffled, convex bottom flasks (500 ml
volume) covered by a silicon foam closure. Production flasks
were incubated for 24-48 hrs under the above mentioned
conditions. Following incubation, the broths were dispensed into
sterile 1 L polyethylene bottles, spun at 2600 x g for 1 hr at
10°C and decanted from the cell and debris pellet. The liquid
broth was then vacuum filtered through Whatman GF/D (2.7 um
retention) and GF/B (1.0 um retention) glass filters to remove
debris. Further broth clarification was achieved with a
tangential flow microfiltration device (Pall Filtron,
Northborough, MA) using a 0.5 um open-channel filter. When
necessary, additional clarification could be obtained by chilling
the broth (to 4°C) and centrifuging for several hours at 2600 x
g. Following these procedures, the broth was filter sterilized
using a 0.2 um nitrocellulose membrane filter. Sterile broths
were then used directly for biological assay or biochemical
analysis, or were concentrated (up to 15-fold) using a 10,000 MW
cut-off M12 ultra-filtration device (Amicon, Beverly, MA) or
centrifugal concentrators (Millipore, Bedford, MA and Pall
Filtron, Northborough, MA) with a 10,000 MW pore size. In the
case of centrifugal concentrators, the broth was spun at 2000 x g
for approximately 2 hr. The 10,000 MW permeate was added to the
corresponding retentate to achieve the desired concentration of
components greater than 10,000 MW. Heat inactivation of
processed broth samples was achieved by heating the samples at
100°C in a sand-filled heat block for 10 minutes.
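The volume arithmetic behind the 1% inoculum and the add-back of permeate to a retentate can be sketched as follows. The helper names and the example volumes in the usage note are hypothetical, and the calculation assumes that every component above the cut-off remains in the retentate.

```python
def inoculum_volume_ml(media_volume_ml, percent=1.0):
    """Volume of seed culture needed for a given percent (v/v) inoculum."""
    return media_volume_ml * percent / 100.0


def permeate_to_add_ml(start_volume_ml, retentate_volume_ml, target_fold):
    """Permeate to add back to a retentate to reach a target fold concentration.

    Assumes all components above the cut-off stay in the retentate, so the
    final volume must equal start_volume_ml / target_fold.
    """
    final_volume_ml = start_volume_ml / target_fold
    to_add_ml = final_volume_ml - retentate_volume_ml
    if to_add_ml < 0:
        raise ValueError("retentate is not concentrated enough for this target")
    return to_add_ml
```

For instance, `inoculum_volume_ml(175)` reproduces the 1.75 ml per 175 ml figure given in the text, and `permeate_to_add_ml(150, 8, 15)` gives the add-back volume for a hypothetical 150 ml broth reduced to an 8 ml retentate and adjusted to 15-fold.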

The broth(s) and toxin complex(es) from different
Photorhabdus strains are useful for reducing populations of
insects and were used in a method of inhibiting an insect
population which comprises applying to a locus of the insect an
effective insect inactivating amount of the active described. A
demonstration of the breadth of insecticidal activity observed
from broths of a selected group of Photorhabdus strains fermented
as described above is shown in Table 19. It is possible that
additional insecticidal activities could be detected with these
strains through increased concentration of the broth or by
employing different fermentation methods. Consistent with the
activity being associated with a protein, the insecticidal
activity of all strains tested was heat labile (see above).

Culture broth(s) from diverse Photorhabdus strains show
differential insecticidal activity (mortality and/or growth
inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm
larvae and boll weevil larvae, which are members of the insect
order Coleoptera. Other members of the Coleoptera include
wireworms, pollen beetles, flea beetles, seed beetles and
Colorado potato beetle. Activity is also observed against aster
leafhopper and corn planthopper, which are members of the order
Homoptera. Other members of the Homoptera include planthoppers,
pear psylla, apple sucker, scale insects, whiteflies and spittle
bugs, as well as numerous host specific aphid species. The broths
and purified toxin complex(es) are also active against tobacco
budworm, tobacco hornworm and European corn borer, which are
members of the order Lepidoptera. Other typical members of this
order are beet armyworm, cabbage looper, black cutworm, corn
earworm, codling moth, clothes moth, Indian mealmoth, leaf
rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent
caterpillar, sod webworm and fall armyworm. Activity is also
seen against fruit fly and mosquito larvae, which are members of
the order Diptera. Other members of the order Diptera are pea
midge, carrot fly, cabbage root fly, turnip root fly, onion fly,
crane fly, house fly and various mosquito species. Activity
with broth(s) and toxin complex(es) is also seen against
two-spotted spider mite, which is a member of the order Acarina,
which includes strawberry spider mites, broad mites, citrus red
mite, European red mite, pear rust mite and tomato russet mite.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth(s) (0-15 fold concentrated, filter
sterilized), 2% Proteose Peptone #3, purified toxin complex(es)
[0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were
applied directly to the surface (about 1.5 cm2) of artificial
diet (Rose, R. I. and McCabe, J. M. (1973). J. Econ. Entomol. 66,
398-400) in 40 ul aliquots. Toxin complex was diluted in 10 mM
sodium phosphate buffer, pH 7.0. The diet plates were allowed to
air-dry in a sterile flow-hood and the wells were infested with
single, neonate Diabrotica undecimpunctata howardi (Southern corn
rootworm, SCR) hatched from surface sterilized eggs. The plates
were sealed, placed in a humidified growth chamber and maintained
at 27°C for the appropriate period (3-5 days). Mortality and
larval weight determinations were then scored. Generally, 16
insects per treatment were used in all studies. Control
mortality was generally less than 5%.

Activity against boll weevil (Anthonomus grandis) was tested
as follows. Concentrated (1-10 fold) Photorhabdus broths,
control medium (2% Proteose Peptone #3), purified toxin
complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0
were applied in 60 ul aliquots to the surface of 0.35 g of
artificial diet (Stoneville Yellow lepidopteran diet) and allowed
to dry. A single, 12-24 hr boll weevil larva was placed on the
diet, and the wells were sealed and held at 25°C, 50% RH for 5
days. Mortality and larval weights were then assessed. Control
mortality ranged between 0-13%.

Activity against mosquito larvae was tested as follows. The
assay was conducted in a 96-well microtiter plate. Each well
contained 200 ul of aqueous solution (10-fold concentrated
Photorhabdus culture broth(s), control medium (2% Proteose
Peptone #3), 10 mM sodium phosphate buffer, toxin complex(es) at
0.23 mg/ml or H2O) and approximately 20 1-day old larvae (Aedes
aegypti). There were 6 wells per treatment. The results were
read at 3-4 days after infestation. Control mortality was
between 0-20%.

Activity against fruit flies was tested as follows.
Purchased Drosophila melanogaster medium was prepared using 50%
dry medium and a 50% liquid of either water, control medium (2%
Proteose Peptone #3), 10-fold concentrated Photorhabdus culture
broth(s), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium
phosphate buffer, pH 7.0. This was accomplished by placing 4.0
ml of dry medium in each of 3 rearing vials per treatment and
adding 4.0 ml of the appropriate liquid. Ten late instar
Drosophila melanogaster maggots were then added to each 25 ml
vial. The vials were held on a laboratory bench, at room
temperature, under fluorescent ceiling lights. Pupal or adult
counts were made after 15 days of exposure. Adult emergence was
compared to water and control medium (0-16% reduction).

Activity against aster leafhopper adults (Macrosteles
severini) and corn planthopper nymphs (Peregrinus maidis) was
tested with an ingestion assay designed to allow ingestion of the
active without other external contact. The reservoir for the
active/"food" solution is made by making 2 holes in the center of
the bottom portion of a 35x10 mm Petri dish. A 2 inch Parafilm
square is placed across the top of the dish and secured with
an "O" ring. A 1 oz. plastic cup is then infested with
approximately 7 hoppers and the reservoir is placed on top of the
cup, Parafilm down. The test solution is then added to the
reservoir through the holes. In tests using 10-fold concentrated
Photorhabdus culture broth(s), the broth and control medium (2%
Proteose Peptone #3) were dialyzed against 10 mM sodium phosphate
buffer, pH 7.0 and sucrose (to 5%) was added to the resulting
solution to reduce control mortality. Purified toxin complex(es)
[0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 was also
tested. The assay was held in an incubator at 28°C, 70% RH with
a 16/8 photoperiod and was graded for mortality at 72 hours
(day 3). Control mortality was less than 6%.
Activity against lepidopteran larvae was tested as follows.
Concentrated (10-fold) Photorhabdus culture broth(s), control
medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23
mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied
directly to the surface (about 1.5 cm2) of standard artificial
lepidopteran diet (Stoneville Yellow diet) in 40 ul aliquots.
The diet plates were allowed to air-dry in a sterile flow-hood
and each well was infested with a single, neonate larva. European
corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca
sexta) eggs were obtained from commercial sources and hatched
in-house, whereas tobacco budworm (Heliothis virescens) larvae
were supplied internally. Following infestation with larvae, the
diet plates were sealed, placed in a humidified growth chamber and
maintained in the dark at 27°C for the appropriate period.
Mortality and weight determinations were scored at day 5.
Generally, 16 insects per treatment were used in all studies.
Control mortality generally ranged from 4-12.5% for control
medium and was less than 10% for phosphate buffer.



Activity against two-spotted spider mite (Tetranychus
urticae) was determined as follows. Young squash plants were
trimmed to a single cotyledon and sprayed to run-off with 10-fold
concentrated broth(s), control medium (2% Proteose Peptone #3),
purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate
buffer, pH 7.0. After drying, the plants were infested with a
mixed population of spider mites and held at lab temperature and
humidity for 72 hr. Live mites were then counted to determine
levels of control.
Table 19

Observed Insecticidal Spectrum of Broths From Different
Photorhabdus Strains

Strain         Sensitive* Insect Species

WX-1           3**, 4, 5, 6, 7, [remainder illegible]
WX-2           2, 4
WX-3           1, 4
WX-4           1, 4
WX-5           4
WX-6           4
WX-7           3, 4, 5, 6, 7, 8
WX-8           1, 2, 4
WX-9           1, [illegible], 4
WX-10          4
WX-11          1, 2, 4
WX-12          2, 4, 5, 6, 7, 8
WX-14          1, 2, 4
WX-15          1, 2, 4
W30            3, 4, 5, 8
NC-1           1, 2, 3, 4, 5, 6, [remainder illegible]
WIR            2, 3, 5, 6, 7, 8
HP88           1, 3, 4, 5, 7, 8
Hb             3, 4, 5, 7, 8
Hm             1, 2, 3, 4, 5, 7, [remainder illegible]
H9             1, 2, 3, 4, 5, 6, [remainder illegible]
W-14           1, 2, 3, 4, 5, 6, [remainder illegible]
ATCC 43948     4
ATCC 43949     4
ATCC 43950     4
ATCC 43951     4
ATCC 43952     4

Entries displaced from their rows in the source: 8, 9; 8; 10.

*  = > 25% mortality and/or growth inhibition vs. control
** = 1; Tobacco budworm, 2; European corn borer, 3;
     Tobacco hornworm, 4; Southern corn rootworm, 5;
     Boll weevil, 6; Mosquito, 7; Fruit fly, 8;
     Aster leafhopper, 9; Corn planthopper, 10;
     Two-spotted spider mite.
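The "> 25% mortality and/or growth inhibition vs. control" criterion in the Table 19 footnote could be applied to bioassay readings along the following lines. This is only a sketch: the text does not say how the comparison to control was computed, so the use of excess mortality in percentage points and of percent weight reduction relative to control are assumptions, as are the function names.

```python
def growth_inhibition_pct(treated_weight_mg, control_weight_mg):
    """Percent reduction in mean larval weight relative to control (assumed definition)."""
    return 100.0 * (1.0 - treated_weight_mg / control_weight_mg)


def is_sensitive(treated_mortality_pct, control_mortality_pct,
                 treated_weight_mg=None, control_weight_mg=None,
                 threshold_pct=25.0):
    """Return True if mortality or growth inhibition exceeds the threshold.

    Mortality is compared as treated minus control, in percentage points
    (an assumption; the text only says "vs. control").
    """
    if treated_mortality_pct - control_mortality_pct > threshold_pct:
        return True
    if treated_weight_mg is not None and control_weight_mg:
        inhibition = growth_inhibition_pct(treated_weight_mg, control_weight_mg)
        return inhibition > threshold_pct
    return False
```

Under these assumptions, a treatment with 50% mortality against 5% in the control would be scored as sensitive, as would one with low mortality but surviving larvae at half the control weight.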
Example 14
Non W-14 Photorhabdus Strains:
Purification, Characterization and Activity Spectrum

Purification

The protocol, as follows, is similar to that developed for
the purification of W-14 and was established based on purifying
those fractions having the most activity against Southern corn
rootworm (SCR), as determined in bioassays (see Example 13).
Typically, 4-20 L of broth that had been filtered, as described
in Example 13, were received and concentrated using an Amicon
spiral ultrafiltration cartridge Type S1Y100 attached to an
Amicon M-12 filtration device. The retentate contained native
proteins of molecular sizes greater than 100 kDa,
whereas the flow-through material contained native proteins less
than 100 kDa in size. The majority of the activity against SCR
was contained in the 100 kDa retentate. The retentate was then
continually diafiltered with 10 mM sodium phosphate (pH = 7.0)
until the filtrate reached an A280 < 0.100. Unless otherwise
stated, all procedures from this point were performed in buffer
as defined by 10 mM sodium phosphate (pH 7.0). The retentate was
then concentrated to a final volume of approximately 0.20 L and
filtered using a 0.45 um Nalgene™ Filterware sterile filtration
unit. The filtered material was loaded at 7.5 ml/min onto a
Pharmacia HR16/10 column which had been packed with PerSeptive
Biosystem Poros® 50 HQ strong anion exchange matrix equilibrated
in buffer using a PerSeptive Biosystem Sprint® HPLC system.
After loading, the column was washed with buffer until an A280 <
0.100 was achieved. Proteins were then eluted from the column at
2.5 ml/min using buffer with 0.4 M NaCl for 20 min for a total
volume of 50 ml. The column was then washed using buffer with
1.0 M NaCl at the same flow rate for an additional 20 min (final
volume = 50 ml). Proteins eluted with 0.4 M and 1.0 M NaCl were
placed in separate dialysis bags (Spectra/Por® Membrane MWCO:
2,000) and allowed to dialyze overnight at 4°C in 12 L buffer.
The majority of the activity against SCR was contained in the 0.4
M fraction. The 0.4 M fraction was further purified by
application of 20 ml to a Pharmacia XK 26/100 column that had
been prepacked with Sepharose CL4B (Pharmacia) using a flow rate
of 0.75 ml/min. Fractions were pooled based on A280 peak profile
and concentrated to a final volume of 0.75 ml using a Millipore
Ultrafree®-15 centrifugal filter device with a Biomax-50K NMWL
membrane. Protein concentrations were determined using a Biorad
Protein Assay Kit with bovine gamma globulin as a standard.

Characterization

The native molecular weight of the SCR toxin complex was
determined using a Pharmacia HR 16/50 column that had been
prepacked with Sepharose CL4B in buffer. The column was then
calibrated using proteins of known molecular size, thereby
allowing for calculation of the toxin's approximate native
molecular size. As shown in Table 20, the molecular size of the
toxin complex ranged from 777 kDa with strain Hb to 1,900 kDa
with strain WX-14. The yield of toxin complex also varied, from
strain WX-12 producing 0.8 mg/L to strain Hb, which produced
7.0 mg/L.

Proteins found in the toxin complex were examined for
individual polypeptide size using SDS-PAGE analysis. Typically,
20 ug protein of the toxin complex from each strain was loaded
onto a 2-15% polyacrylamide gel (Integrated Separation Systems)
and electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After
completion of electrophoresis, the gels were stained overnight in
Biorad Coomassie blue R-250 (0.2% in methanol: acetic acid:
water; 40:10:40 v/v/v). Subsequently, gels were destained in
methanol: acetic acid: water; 40:10:40 (v/v/v). The gels were
then rinsed with water for 15 min and scanned using a Molecular
Dynamics Personal Laser Densitometer®. Lanes were quantitated
and molecular sizes were calculated as compared to Biorad high
molecular weight standards, which ranged from 200-45 kDa.

Sizes of the individual polypeptides comprising the SCR
toxin complex from each strain are listed in Table 21. The sizes
of the individual polypeptides ranged from 230 kDa with strain
WX-1 to a size of 16 kDa, as seen with strain WX-7. Every
strain, with the exception of strain Hb, had polypeptides
comprising the toxin complex that were in the 160-230 kDa range,
the 100-160 kDa range, and the 50-80 kDa range. These data
indicate that the toxin complex may vary in peptide composition
and components from strain to strain; however, in all cases the
toxin appears to consist of a large, oligomeric protein complex.

Table 20

Characterization of a Toxin Complex From
Non W-14 Photorhabdus Strains

            Approx. Native       Yield Active
Strain      Molecular Wt.a       Fraction (mg/L)b

H9            972,000              1.8
Hb            777,000              7.0
Hm          1,400,000              1.1
HP88          813,000              2.5
NC-1        1,092,000              3.3
WIR           979,000              1.0
WX-1          973,000              0.3
WX-2          951,000              2.2
WX-7        1,000,000              1.5
WX-12         898,000              0.4
WX-14       1,900,000              1.9
W-14          860,000              7.5

a Native molecular weight determined using a Pharmacia HR
  16/50 column packed with Sepharose CL4B.
b Amount of toxin complex recovered from culture broth.

Activity Spectrum

As shown in Table 22, the toxin complexes purified from
strains Hm and H9 were tested for activity against a variety of
insects, with the toxin complex from strain W-14 for comparison.
The assays were performed as described in Example 13. The toxin
complex from all three strains exhibited activity against tobacco
budworm, European corn borer, Southern corn rootworm, and aster
leafhopper. Furthermore, the toxin complex from strains Hm and
W-14 also exhibited activity against two-spotted spider mite. In
addition, the toxin complex from W-14 exhibited activity against
mosquito larvae. These data indicate that the toxin complex,
while having similarities in activities between certain orders of
insects, can also exhibit differential activities against other
orders of insects.
Table 21

The Approximate Sizes (in kDa) of Peptides in a Purified
Toxin Complex From Non W-14 Photorhabdus

Table 22

Observed Insecticidal Spectrum of a Purified Toxin Complex from
Photorhabdus Strains

Photorhabdus Strain      Sensitive* Insect Species

Hm Toxin Complex         1**, 2, 3, 5, 6, 7, 8
H9 Toxin Complex         1, 2, 3, 6, 7, 8
W-14 Toxin Complex       1, 2, 3, 4, 5, 6, 7, 8

*  = > 25% mortality or growth inhibition
** = 1; Tobacco budworm, 2; European corn borer, 3; Southern
     corn rootworm, 4; Mosquito, 5; Two-spotted spider mite,
     6; Aster leafhopper, 7; Fruit fly, 8; Boll weevil


Example 15

Sub-Fractionation of Photorhabdus Protein Toxin Complex

The Photorhabdus protein toxin complex was isolated as
described in Example 14. Next, about 10 mg toxin was applied to
a MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a
flow rate of 1 ml/min. The column was washed with 20 mM Tris-HCl,
pH 7.0 until the optical density at 280 nm returned to baseline
absorbance. The proteins bound to the column were eluted with a
linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1
ml/min for 30 min. One ml fractions were collected and subjected
to Southern corn rootworm (SCR) bioassay (see Example 13). Peaks
of activity were determined by a series of dilutions of each
fraction in SCR bioassays. Two activity peaks against SCR were
observed and were named A (eluted at about 0.2-0.3 M NaCl) and B
(eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled
separately and both peaks were further purified using a 3-step
procedure described below.
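For orientation, the nominal position of a given NaCl concentration within the linear gradient can be worked out from the stated gradient length and flow rate. The sketch below is illustrative only; the function names are hypothetical, and it ignores column dead volume and gradient delay, so real elution positions would be shifted.

```python
def gradient_minutes(conc_m, start_m=0.0, end_m=1.0, duration_min=30.0):
    """Time into a linear gradient at which the nominal concentration equals conc_m."""
    if not min(start_m, end_m) <= conc_m <= max(start_m, end_m):
        raise ValueError("concentration lies outside the gradient")
    return duration_min * (conc_m - start_m) / (end_m - start_m)


def elution_volume_ml(conc_m, flow_ml_per_min=1.0, **gradient):
    """Nominal volume delivered since gradient start when conc_m is reached."""
    return gradient_minutes(conc_m, **gradient) * flow_ml_per_min


# Nominal windows for the two activity peaks described in the text
# (0 to 1.0 M NaCl over 30 min at 1 ml/min, 1 ml fractions).
peak_a_ml = (elution_volume_ml(0.2), elution_volume_ml(0.3))
peak_b_ml = (elution_volume_ml(0.3), elution_volume_ml(0.4))
```

With 1 ml fractions at 1 ml/min, these volumes map directly onto approximate fraction numbers, which is why peaks A and B fall in adjacent groups of fractions.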

Solid (NH4)2SO4 was added to the above protein fraction to a
final concentration of 1.7 M. Proteins were then applied to a
phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in
50 mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins
bound to the column were eluted with a linear gradient of 1.7 M
(NH4)2SO4, 0% ethylene glycol, 50 mM potassium phosphate, pH 7.0
to 25% ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no
(NH4)2SO4) at 0.5 ml/min. Fractions were dialyzed overnight
against 10 mM sodium phosphate buffer, pH 7.0. Activities in
each fraction against SCR were determined by bioassay.

The fractions with the highest activity were pooled and
applied to a MonoQ 5/5 column which was equilibrated with 20 mM
Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column
were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in
20 mM Tris-HCl, pH 7.0.

For the final step of purification, the most active
fractions above (determined by SCR bioassay) were pooled and
subjected to a second phenyl-Superose 5/5 column. Solid
(NH4)2SO4 was added to a final concentration of 1.7 M. The
solution was then loaded onto the column equilibrated with 1.7 M
(NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7 at 1 ml/min.
Proteins bound to the column were eluted with a linear gradient
of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM
potassium phosphate, pH 7.0 at 0.5 ml/min. Fractions were
dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0.
Activities in each fraction against SCR were determined by
bioassay.

The final purified protein by the above 3-step procedure
from peak A was named toxin A and the final purified protein from
peak B was named toxin B.



Characterization and Amino Acid Sequencing of Toxin A and Toxin B

In SDS-PAGE, both toxin A and toxin B contained two major (>
90% of total Coomassie stained protein) peptides: 192 kDa (named
A1 and B1, respectively) and 58 kDa (named A2 and B2,
respectively). Both toxin A and toxin B revealed only one major
band in native PAGE, indicating A1 and A2 were subunits of one
protein complex, and B1 and B2 were subunits of one protein
complex. Further, the native molecular weight of both toxin A
and toxin B was determined to be 860 kDa by gel filtration
chromatography. The relative molar concentration of A1 to A2
was judged to be a 1 to 1 equivalence as determined by
densitometric analysis of SDS-PAGE gels. Similarly, B1 and B2
peptides were present at the same molar concentration.
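A molar ratio can be derived from densitometry by dividing each band's stained mass by its subunit size. The sketch below assumes that stain intensity is proportional to protein mass, which the text does not state; the function name and the band intensities in the test are hypothetical.

```python
def molar_ratio(mass_a, mw_a_kda, mass_b, mw_b_kda):
    """Moles of peptide A per mole of peptide B.

    mass_a and mass_b are relative stained masses (any common unit) taken
    from densitometry; mw_a_kda and mw_b_kda are the subunit sizes.
    Assumes stain intensity is proportional to protein mass.
    """
    return (mass_a / mw_a_kda) / (mass_b / mw_b_kda)


# For the 192 kDa and 58 kDa peptides, a 1:1 molar ratio corresponds to
# band masses in the proportion 192:58.
ratio = molar_ratio(192.0, 192.0, 58.0, 58.0)
```

Put differently, at molar equivalence the larger peptide would account for roughly three quarters of the stained mass of the pair.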

Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and
transblotted to PVDF membranes. Blots were sent for amino acid
analysis and N-terminal amino acid sequencing at Harvard
MicroChem and Cambridge ProChem, respectively. The N-terminal
amino sequence of B1 was determined to be identical to SEQ ID
NO:1, the TcbAii region of the tcbA gene (SEQ ID NO:12, position
37 to 99). A unique N-terminal sequence was obtained for peptide
B2 (SEQ ID NO:40). The N-terminal amino acid sequence of peptide
B2 was identical to the TcbAiii region of the derived amino acid
sequence for the tcbA gene (SEQ ID NO:12, position 1935 to 1945).
Therefore, the B toxin contained predominantly two peptides,
TcbAii and TcbAiii, that were observed to be derived from the
same gene product, TcbA.

The N-terminal sequence of A2 (SEQ ID NO:41) was unique in
comparison to the TcbAiii peptide and other peptides. The A2
peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was
determined to be a mixture of amino acid sequences SEQ ID NO:40
and 41.

Peptides A1 and A2 were further subjected to internal amino
acid sequencing. For internal amino acid sequencing, 10 ug of
toxin A was electrophoresed in 10% SDS-PAGE and transblotted to
PVDF membrane. After the blot was stained with amido black,
peptides A1 and A2, denoted TcdAii and TcdAiii, respectively,
were excised from the blot and sent to Harvard MicroChem and
Cambridge ProChem. Peptides were subjected to trypsin digestion
followed by HPLC chromatography to separate individual peptides.
N-terminal amino acid analysis was performed on selected tryptic
peptide fragments. Two internal amino acid sequences of peptide
A1 (TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were
found to have significant homologies with deduced amino acid
sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12).
Similarly, the N-terminal sequence (SEQ ID NO:41) and two
internal sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and
TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with
deduced amino acid sequences of the TcbAiii region of the tcbA
gene (SEQ ID NO:12).

In summary of the above results, the toxin complex has at least
two active protein toxin complexes against SCR: toxin A and toxin
B. Toxin A and toxin B are similar in their native and subunit
molecular weights; however, their peptide compositions are
different. Toxin A contains peptides TcdAii and TcdAiii as the
major peptides and toxin B contains TcbAii and TcbAiii as the
major peptides.
Example 16
Cleavage and Activation of TcbA Peptide

In the toxin B complex, peptides TcbAii and TcbAiii originate
from the single gene product TcbA (Example 15). The processing of
TcbA peptide to TcbAii and TcbAiii is presumably by the action of
Photorhabdus protease(s), and most likely, the metalloproteases
described in Example 10. In some cases, it was noted that when
Photorhabdus W-14 broth was processed, TcbA peptide was present in
toxin B complex as a major component, in addition to peptides
TcbAii and TcbAiii. Identical procedures, described for the
purification of toxin B complex (Example 15), were used to enrich
peptide TcbA from the toxin complex fraction of W-14 broth. The
final purified material was analyzed in a 4-20% gradient SDS-PAGE
and major peptides were quantified by densitometry. It was
determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and
6%, respectively, of total protein. The identities of these
peptides were confirmed by their respective molecular sizes in
SDS-PAGE and Western blot analysis using monospecific antibodies.
The native molecular weight of this fraction was determined to be
860 kDa.

The cleavage of TcbA was evaluated by treating the above
purified material with purified 38 kDa and 58 kDa W-14
Photorhabdus metalloproteases (Example 10), and Trypsin as a
control enzyme (Sigma, MO). The standard reaction consisted of
17.5 ug of the above purified fraction, 1.5 unit protease, and
0.1 M Tris buffer, pH 8.0 in a total volume of 100 ul. For the
control reaction, protease was omitted. The reaction mixtures were
incubated at 37°C for 90 min. At the end of the reaction, 20 ul
was taken and boiled with SDS-PAGE sample buffer immediately for
electrophoresis analysis in a 4-20% gradient SDS-PAGE. It was
determined from SDS-PAGE that in both 38 kDa and 58 kDa protease
treatments, the amount of peptides TcbAii and TcbAiii increased
about 3-fold while the amount of TcbA peptide decreased
proportionally (Table 23). The relative reduction and
augmentation of selected peptides was confirmed by Western blot
analyses. Furthermore, gel filtration of the cleaved material
revealed that the native molecular size of the complex remained
the same. Upon trypsin treatment, peptides TcbA and TcbAii were
nonspecifically digested into small peptides. This indicated that
38 kDa and 58 kDa Photorhabdus proteases can specifically process
peptide TcbA into peptides TcbAii and TcbAiii. Protease treated
and untreated control of the remaining 90 ul reaction mixture were
serially diluted with 10 mM sodium phosphate buffer, pH 7.0 and
analyzed by SCR bioassay. By comparing activity in several
dilutions, it was determined that the 38 kDa protease treatment
increased SCR insecticidal activity approximately 3 to 4 fold.
The growth inhibition of remaining insects in the protease
treatment was also more severe than control (Table 23).

Table 23

Conversion and activation of peptide TcbA into peptides TcbAii and
TcbAiii by protease treatment.

                                  Control    38 kDa protease treatment

TcbA (% of total protein)           58          18
TcbAii (% of total protein)         36          64
TcbAiii (% of total protein)         6          18
LD50 (ug protein)                    2.1         0.52
SCR Weight (mg/insect)*              0.2         0.1

* an indication of growth inhibition by measuring the average
  weight of live insects after 5 days on diet in the assay.
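The fold changes implied by Table 23 can be recomputed directly from its control and treatment columns. The sketch below is illustrative; the dictionary and function names are hypothetical, and the three peptide rows are taken to be TcbA, TcbAii and TcbAiii on the basis of the 58%, 36% and 6% composition reported in the text.

```python
# (control, 38 kDa protease treatment) values from Table 23.
table_23 = {
    "TcbA_pct": (58.0, 18.0),
    "TcbAii_pct": (36.0, 64.0),
    "TcbAiii_pct": (6.0, 18.0),
    "LD50_ug": (2.1, 0.52),
}


def fold_change(control, treated):
    """Treated value expressed as a multiple of the control value."""
    return treated / control


peptide_fold = {
    name: fold_change(*values)
    for name, values in table_23.items()
    if name.endswith("_pct")
}

# A lower LD50 means higher potency, so the gain is control / treated.
control_ld50, treated_ld50 = table_23["LD50_ug"]
potency_gain = control_ld50 / treated_ld50
```

The LD50 ratio computed this way falls in the "approximately 3 to 4 fold" range of increased SCR activity described in the text.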



Example 17

Screening of the library for a gene encoding the TcdAii Peptide

The cloning and characterization of a gene encoding the
TcdAii peptide, described as SEQ ID NO:17 (internal peptide
TcdAii-PT111 N-terminal sequence) and SEQ ID NO:18 (internal
peptide TcdAii-PT79 N-terminal sequence) was completed. Two
pools of degenerate oligonucleotides, designed to encode the
amino acid sequences of SEQ ID NO:17 (Table 24) and SEQ ID NO:18
(Table 25), and the reverse complements of those sequences, were
synthesized as described in Example 8. The DNA sequence of the
oligonucleotides is given below:



[Tables 24 and 25, listing the degenerate oligonucleotides designed
from SEQ ID NO:17 and SEQ ID NO:18, are not recoverable from the
source text.]



Polymerase Chain Reactions (PCR) were performed essentially
as described in Example 8, using as forward primers P2.3.6.CB or
P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all
forward/reverse combinations, using Photorhabdus W-14 genomic DNA
as template. In another set of reactions, primers P2.79.2 or
P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5F.I, and
P2.3R.CB were used as reverse primers in all forward/reverse
combinations. Only in the reactions containing P2.3.6.CB as the
forward primer combined with P2.79.R.1 or P2.79R.CB as the
reverse primers was a non-artifactual amplified product seen, of
estimated size (mobility on agarose gels) of 2500 base pairs.
The order of the primers used to obtain this amplification
product indicates that the peptide fragment TcdAii-PT111 lies
amino-proximal to the peptide fragment TcdAii-PT79.
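
As a rough illustration of what a pool of degenerate oligonucleotides and its reverse complements amounts to, the following Python sketch expands a short peptide into every DNA sequence that could encode it. The peptide MKDNW, the function names and the reduced codon table are illustrative assumptions only; they are not the primers or sequences of Tables 24 and 25.

```python
from itertools import product

# Standard genetic code, restricted to the residues of the demo peptide.
CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "D": ["GAT", "GAC"],
    "N": ["AAT", "AAC"],
    "W": ["TGG"],
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def degenerate_pool(peptide):
    """List every DNA sequence that encodes the given peptide."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(codons) for codons in product(*choices)]


# Hypothetical peptide, chosen only to keep the pool small.
forward_pool = degenerate_pool("MKDNW")
reverse_pool = [reverse_complement(oligo) for oligo in forward_pool]

print(len(forward_pool))                  # distinct oligonucleotides in the pool
print(forward_pool[0], reverse_pool[0])   # one forward oligo and its reverse complement
```

The pool size is the product of the codon degeneracies at each position, which is why peptide stretches rich in Met and Trp (one codon each) are generally preferred when designing such primers.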

The 2500 bp PCR products were ligated to the plasmid vector
pCR™II (Invitrogen, San Diego, CA) according to the supplier's
instructions, and the DNA sequences across the ends of the insert
fragments of two isolates (HS24 and HS27) were determined using
the supplier's recommended primers and the sequencing methods
described previously. The sequence of both isolates was the
same. New primers were synthesized based on the determined
sequence, and used to prime additional sequencing reactions to
obtain a total of 2557 bases of the insert (SEQ ID NO:36).
Translation of the partial peptide encoded by SEQ ID NO:36
yields the 845 amino acid sequence disclosed as SEQ ID NO:37.
Protein homology analysis of this portion of the TcdAii peptide
fragment reveals substantial amino acid homology (68% similarity;
53% identity) to residues 542 to 1390 of protein TcbA (SEQ ID
NO:12). It is therefore apparent that the gene represented in
part by SEQ ID NO:36 produces a protein of similar, but not
identical, amino acid sequence to the TcbA protein, and which
likely has similar, but not identical, biological activity to the
TcbA protein.

In yet another instance, a gene encoding the peptides
TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described
as SEQ ID NO:39 (internal peptide TcdAii-PK44 sequence), and SEQ ID
NO:41 (TcdAiii 58 kDa N-terminal peptide sequence) was isolated.
Two pools of degenerate oligonucleotides, designed to encode the
amino acid sequences described as SEQ ID NO:39 (Table 27) and SEQ
ID NO:41 (Table 26), and the reverse complements of those
sequences, were synthesized as described in Example 3, and the
DNA sequences.

Polymerase Chain Reactions (PCR) were performed essentially
as described in Example 9, using as forward primers A1.44.1 or
A1.44.2, and reverse primers A2.3R or A2.4R, in all
forward/reverse combinations, using Photorhabdus W-14 genomic DNA
as template. In another set of reactions, primers A2.1 or A2.2
were used as forward primers, and A1.44.1R and A1.44.2R were
used as reverse primers in all forward/reverse combinations.
Only in the reactions containing A1.44.1 or A1.44.2 as the
forward primers combined with A2.3R as the reverse primer was a
non-artifactual amplified product seen, of estimated size
(mobility on agarose gels) of 1400 base pairs. The order of the
primers used to obtain this amplification product indicates that
the peptide fragment TcdAii-PK44 lies amino-proximal to the 58
kDa peptide fragment of TcdAiii.
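
The reasoning in the preceding paragraph (try both primer orientations, see which one amplifies, and read the peptide order off the result) can be sketched in a few lines of Python. The primer names are those given in the text; the function name and the set-based bookkeeping are assumptions made for illustration.

```python
from itertools import product

# Orientation 1: PK44-derived primers forward, 58 kDa-derived primers reverse.
pk44_forward = list(product(["A1.44.1", "A1.44.2"], ["A2.3R", "A2.4R"]))
# Orientation 2: 58 kDa-derived primers forward, PK44-derived primers reverse.
kda58_forward = list(product(["A2.1", "A2.2"], ["A1.44.1R", "A1.44.2R"]))

# Primer pairs reported in the text to give the ~1400 bp product.
productive = {("A1.44.1", "A2.3R"), ("A1.44.2", "A2.3R")}


def amino_proximal_peptide(productive_pairs):
    """Infer which peptide is amino-proximal from the productive orientation."""
    if productive_pairs and productive_pairs <= set(pk44_forward):
        return "TcdAii-PK44"
    if productive_pairs and productive_pairs <= set(kda58_forward):
        return "TcdAiii 58 kDa"
    return "undetermined"


print(len(pk44_forward) + len(kda58_forward))  # total reactions set up
print(amino_proximal_peptide(productive))
```

A forward primer anneals to the coding region nearer the 5' end of the gene, so the peptide it was designed from lies nearer the amino terminus of the encoded protein.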

The 1400 bp PCR products were ligated to the plasmid vector
pCR™II according to the supplier's instructions. The DNA
sequences across the ends of the insert fragments of four
isolates were determined using primers similar in sequence to the
supplier's recommended primers and using sequencing methods
described previously. The nucleic acid sequence of all isolates
differed as expected in the regions corresponding to the
degenerate primer sequences, but the amino acid sequences deduced
from these data were the same as the actual amino acid sequences
for the peptides determined previously (SEQ ID NOS:41 and 39).

Screening of the W-14 genomic cosmid library as described in
Example 8 with a radiolabeled probe comprised of the DNA
prepared above (SEQ ID NO:36) identified five hybridizing cosmid
isolates, namely 17D9, 20B10, 21D2, 27B10, and 26D1. These
cosmids were distinct from those previously identified with
probes corresponding to the genes described as SEQ ID NO:11 or
SEQ ID NO:25. Restriction enzyme analysis and DNA blot
hybridizations identified three EcoR I fragments, of approximate
sizes 3.7, 3.7, and 1.1 kbp, that span the region comprising the
DNA of SEQ ID NO:36. Screening of the W-14 genomic cosmid
library using as probe the radiolabeled 1.4 kbp DNA fragment
prepared in this example identified the same five cosmids (17D9,
20B10, 21D2, 27B10, and 26D1). DNA blot hybridization to EcoR I-
digested cosmid DNAs also showed hybridization to the same subset
of EcoR I fragments as seen with the 2.5 kbp TcdAii gene probe,
indicating that both fragments are encoded on the genomic DNA.

DNA sequence determination of the cloned EcoR I fragments
revealed an uninterrupted reading frame of 7551 base pairs (SEQ
ID NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ
ID NO:47). Analysis of the amino acid sequence of this protein
revealed all expected internal fragments of peptides TcdAii (SEQ
ID NOS:17, 18, 37, 38 and 39) and the TcdAiii peptide N-terminus
(SEQ ID NO:41) and all TcdAiii internal peptides (SEQ ID NOS:42
and 43). The peptides isolated and identified as TcdAii and
TcdAiii are each products of the open reading frame, denoted
tcdA, disclosed as SEQ ID NO:46. Further, SEQ ID NO:47 shows,
starting at position 89, the sequence disclosed as SEQ ID NO:13,
which is the N-terminal sequence of a peptide of size
approximately 201 kDa, indicating that the initial protein
produced from SEQ ID NO:46 is processed in a manner similar to
that previously disclosed for SEQ ID NO:12. In addition, the
protein is further cleaved to generate a product of size 209.2
kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID NO:49
(TcdAii peptide), and a product of size 63.6 kDa, encoded by SEQ
ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus,
it is thought that the insecticidal activity identified as toxin
A (Example 15) derives from the products of SEQ ID NO:46, as
exemplified by the full-length protein of 282.9 kDa disclosed as
SEQ ID NO:47, which is processed to produce the peptides disclosed
as SEQ ID NOS:49 and 51. It is thought that the insecticidal
activity identified as toxin B (Example 15) derives from the
products of SEQ ID NO:11, as exemplified by the 280.6 kDa protein
disclosed as SEQ ID NO:12. This protein is proteolytically
processed to yield the 207.6 kDa peptide disclosed as SEQ ID
NO:53, which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide
having N-terminal sequence disclosed as SEQ ID NO:40, and further
disclosed as SEQ ID NO:55, which is encoded by SEQ ID NO:54.
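
The sizes quoted for the tcdA reading frame can be cross-checked with simple arithmetic. The sketch below assumes that the 7551 base pair figure includes the stop codon and that the 209.2 kDa and 63.6 kDa products together account for the protein remaining after residues 1 to 88 are removed; both are assumptions made for this illustration, and the average residue mass is only an estimate derived from the stated totals.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by a reading frame of orf_bp base pairs."""
    if orf_bp % 3:
        raise ValueError("reading frame length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


tcda_residues = protein_length(7551)          # assumes the stop codon is counted
avg_residue_da = 282.9e3 / tcda_residues      # average mass per residue, in Da

# Residues 1-88 precede the N-terminus found at position 89.
removed_kda = 88 * avg_residue_da / 1000
remaining_kda = 282.9 - removed_kda
products_kda = 209.2 + 63.6                   # TcdAii + TcdAiii

print(tcda_residues)
print(round(avg_residue_da, 1))
print(round(remaining_kda, 1), round(products_kda, 1))
```

Under these assumptions the two cleavage products add up to within a fraction of a kilodalton of the full-length protein less its first 88 residues.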

Amino acid sequence comparisons between the proteins
disclosed as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have
69% similarity and 54% identity. This high degree of
evolutionary relationship is not uniform throughout the entire
amino acid sequence of these peptides, but is higher towards the
carboxy-terminal end of the proteins, since the peptides
disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID
NO:55 (derived from SEQ ID NO:12) have 76% similarity and 64%
identity.
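
For readers unfamiliar with how percent identity and percent similarity are obtained from a pairwise alignment, a minimal Python sketch follows. The similarity groups, the function names and the two short aligned strings are assumptions chosen for illustration; the source does not state which alignment program or scoring scheme produced its figures.

```python
# Assumed, simplified groups of chemically similar residues (one-letter codes).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _is_similar(a, b):
    """True if two different residues fall in the same assumed group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue                      # gaps are not counted
        aligned += 1
        if a == b:
            identical += 1
            similar += 1                  # identical residues also count as similar
        elif _is_similar(a, b):
            similar += 1
    return 100 * identical / aligned, 100 * similar / aligned


# Hypothetical aligned fragments, not taken from any SEQ ID.
identity, similarity = identity_similarity("MKLV-DE", "MRLIADQ")
print(round(identity, 1), round(similarity, 1))
```

Similarity is always at least as large as identity because every identical position is also counted as similar, which matches the pattern of the paired figures quoted in the text.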
Example 18

Control of European Corn Borer-Induced Leaf Damage on Maize Plants
by Spray Application of Photorhabdus (Strain W-14) Broth

The ability of Photorhabdus toxin(s) to reduce plant damage
caused by insect larvae was demonstrated by measuring leaf damage
caused by European corn borer (Ostrinia nubilalis) infested onto
maize plants treated with Photorhabdus broth. Fermentation broth
from Photorhabdus strain W-14 was produced and concentrated
approximately 10-fold using ultrafiltration (10,000 MW pore size)
as described in Example 13. The resulting concentrated broth was
then filter sterilized using 0.2 micron nitrocellulose membrane
filters. A similarly prepared sample of uninoculated 2% proteose
peptone #3 was used for control purposes. Maize plants (a
DowElanco proprietary inbred line) were grown from seed to
vegetative stage 7 or 8 in pots containing a soilless mixture in
a greenhouse (27°C day; 22°C night, about 50% RH, 14 hr
day-length, watered/fertilized as needed). The test plants were
arranged in a randomized complete block design (3 reps/treatment,
6 plants/treatment) in a greenhouse with temperature about 22°C
day; 18°C night, no artificial light and with partial shading,
about 50% RH and watered/fertilized as needed. Treatments
(uninoculated media and concentrated Photorhabdus broth) were
applied with a syringe sprayer, 2.0 mls applied from directly
(about 6 inches) over the whorl and 2.0 additional mls applied in
a circular motion from approximately one foot above the whorl.
In addition, one group of plants received no treatment. After
the treatments had dried (approximately 30 minutes), twelve
neonate European corn borer larvae (eggs obtained from commercial
sources and hatched in-house) were applied directly to the whorl.
After one week, the plants were scored for damage to the leaves
using a modified Guthrie Scale (Koziel, M. G., Beland, G. L.,
Bowman, C., Carozzi, N. B., Crenshaw, R., Crossland, L., Dawson,
J., Desai, M., Hill, M., Kadwell, S., Launis, K., Lewis, K.,
Maddox, D., McPherson, K., Meghji, M. Z., Merlin, E., Rhodes, R.,
Warren, G. W., Wright, M. and Evola, S. V. (1993)
Bio/Technology, 11, 194-195) and the scores were compared
statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range
(HSD) Test p<0.1]. The results are shown in Table 28. For
reference, a score of 1 represents no damage, a score of 2
represents fine "window pane" damage on the unfurled leaf with no
pinhole penetration and a score of 5 represents leaf penetration
with elongated lesions and/or mid rib feeding evident on more
than three leaves (lesions < 1 inch). These data indicate that
broth or other protein containing fractions may confer protection
against specific insect pests when delivered in a sprayable
formulation or when the gene or derivative thereof, encoding the
protein or part thereof, is delivered via a transgenic plant or
microbe.

Table 28

Effect of Photorhabdus Culture Broth on
European Corn Borer-Induced Leaf Damage on Maize

Treatment                 Average Guthrie Score

No Treatment                   5.02 a
Uninoculated medium            5.15 a
Photorhabdus Broth             2.24 b

Means with different letters are statistically different
(p<0.05 or p<0.1).

Example 19

Genetic Engineering of Genes for Expression in E. coli

Summary of constructions

A series of plasmids were constructed to express the tcbA
gene of Photorhabdus W-14 in Escherichia coli. A list of the
plasmids is shown in Table 29. A brief description of each
construction follows as well as a summary of the E. coli
expression data obtained.

Table 29

Expression plasmids for the tcbA gene.

Plasmid           Gene    Vector/Selection    Compartment

pDAB634           tcbA    pBC/Chl             Intracellular
pAcGP67B/tcbA     tcbA    pAcGP67B/Amp        Baculovirus, secreted
pDAB635           tcbA    pET27b/Kan          Periplasm
pET15-tcbA        tcbA    pET15-tcbA          Intracellular

Abbreviations: Kan=kanamycin, Chl=chloramphenicol, Amp=ampicillin



Construction of pDAB634

In Example 9, a large EcoR I fragment which hybridizes to
the TcbAii probe is described. This fragment was subcloned into
pBC (Stratagene, La Jolla CA). Sequence analysis indicates that
this fragment is 8816 base pairs. The fragment encodes the tcbA
gene with the initiating ATG at position 571 and the terminating
TAA at position 8086. The fragment therefore carries 570 base
pairs of Photorhabdus DNA upstream of the ATG and 730 base pairs
downstream of the TAA.
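
The coordinates given for the subcloned EcoR I fragment can be checked against one another as follows. The sketch assumes that each quoted position is the 1-based coordinate of the first base of the codon; the variable names are illustrative.

```python
FRAGMENT_BP = 8816
ATG_POS = 571     # first base of the start codon (1-based, assumed convention)
TAA_POS = 8086    # first base of the stop codon (1-based, assumed convention)

upstream_bp = ATG_POS - 1                      # bases before the ATG
coding_bp = TAA_POS - ATG_POS                  # ATG through the last sense codon
downstream_bp = FRAGMENT_BP - (TAA_POS + 2)    # bases after the last base of TAA

print(upstream_bp, coding_bp, downstream_bp)
```

Under this convention the coding region matches the 7515 base pair length listed for SEQ ID NO:11, and the downstream region comes to 728 base pairs, in line with the rounded figure of 730 quoted in the text.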

Construction of Plasmid pAcGP67B/tcbA

The tcbA gene was PCR amplified using the following primers:
5' primer (S1Ac51) 5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA
TC 3' and 3' primer (S1Ac31) 5' TTT AAA GCG GCC GCT TAA CGG ATG
GTA TAA CGA ATA TG 3'. PCR was performed using a TaKaRa LA PCR
kit from PanVera (Madison, Wisconsin) in the following reaction:
57.5 µl water, 10 µl 10X LA buffer, 16 µl dNTPs (2.5 mM each
stock solution), 20 µl each primer at 10 pmoles/µl, 300 ng of the
plasmid pDAB634 containing the W-14 tcbA gene and one µl of
TaKaRa LA Taq polymerase. The cycling conditions were 98°C/20
sec, 68°C/5 min, 72°C/10 min for 30 cycles. A PCR product of the
expected size, about 7526 bp, was isolated in a 0.8% agarose gel in
TBE (100 mM Tris, 90 mM boric acid, 1 mM EDTA) buffer and purified
using a Qiaex II kit from Qiagen (Chatsworth, California). The
purified tcbA gene was digested with Nco I and Not I and ligated
into the baculovirus transfer vector pAcGP67B (PharMingen (San
Diego, California)) and transformed into DH5α E. coli. The tcbA
gene was then cut from pAcGP67B and transferred to pET27b to
create plasmid pDAB635. A missense mutation in the tcbA gene was
repaired in pDAB635.
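
The two PCR primers each carry one of the restriction sites later used for cloning. The short Python sketch below locates the Nco I (CCATGG) and Not I (GCGGCCGC) recognition sequences in the primer sequences quoted above; the function and variable names are illustrative.

```python
# Recognition sequences of the two enzymes used for cloning.
SITES = {"NcoI": "CCATGG", "NotI": "GCGGCCGC"}

# Primer sequences as quoted in the text, with the spacing removed.
five_prime = "TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC".replace(" ", "")
three_prime = "TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG".replace(" ", "")


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to its 0-based start index."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print(find_sites(five_prime))
print(find_sites(three_prime))
```

In both primers the site follows a six-base TTTAAA leader, and the Nco I site overlaps the ATG start codon of the gene.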
The repaired tcbA gene contains two changes from the
sequence shown in SEQ ID NO:11: an A>G at 212 changing an
asparagine 71 to serine 71 and a G>A at 229 changing an alanine
77 to threonine 77. These changes are both upstream of the
proposed TcbAii N-terminus.
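
The two nucleotide changes can be related to the stated amino acid changes by mapping each nucleotide position to its codon. In the sketch below, the codons for residues 71 (AAT) and 77 (GCT) are read from the SEQ ID NO:11 listing; the function names and the four-entry codon table are illustrative.

```python
# Only the four codons needed here, from the standard genetic code.
CODON_TABLE = {"AAT": "Asn", "AGT": "Ser", "GCT": "Ala", "ACT": "Thr"}


def locate(nt_position):
    """Return (codon number, 0-based offset within codon) for a 1-based position."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3


def substitute(codon, offset, new_base):
    """Return the codon with the base at the given offset replaced."""
    return codon[:offset] + new_base + codon[offset + 1:]


# (nucleotide position, original codon, new base)
changes = [(212, "AAT", "G"), (229, "GCT", "A")]

for position, codon, new_base in changes:
    number, offset = locate(position)
    mutant = substitute(codon, offset, new_base)
    print(number, CODON_TABLE[codon], "->", CODON_TABLE[mutant])
```

Position 212 is the second base of codon 71 and position 229 is the first base of codon 77, which reproduces the Asn to Ser and Ala to Thr changes described.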

Construction of pET15-tcbA

The tcbA coding region of pDAB635 was transferred to vector
pET15b. This was accomplished using shotgun ligations; the DNAs
were cut with restriction enzymes Nco I and Xho I. The resulting
recombinant is called pET15-tcbA.

Expression of TcbA in E. coli from plasmid pET15-tcbA

Expression of tcbA in E. coli was obtained by modification
of the methods previously described by Studier et al. (Studier,
F.W., Rosenberg, A., Dunn, J., and Dubendorff, J. (1990) Use of
T7 RNA polymerase to direct expression of cloned genes. Methods
Enzymol., 185: 60-89). Competent E. coli cells strain BL21(DE3)
were transformed with plasmid pET15-tcbA and plated on LB agar
containing 100 µg/ml ampicillin and 40 mM glucose. The
transformed cells were plated to a density of several hundred
isolated colonies/plate. Following overnight incubation at 37°C
the cells were scraped from the plates and suspended in LB broth
containing 100 µg/ml ampicillin. Typical culture volumes were
from 200-500 ml. At time zero, culture densities (OD600) were
from 0.05-0.15 depending on the experiment. Cultures were shaken
at one of three temperatures (22°C, 30°C or 37°C) until a density
of 0.15-0.5 was obtained, at which time they were induced with 1
mM isopropylthio-β-galactoside (IPTG). Cultures were incubated
at the designated temperature for 4-5 hours and then were
transferred to 4°C until processing (12-72 hours).

Purification and characterization of TcbA expressed in E. coli
from Plasmid pET15-tcbA.

E. coli cultures expressing TcbA peptides were processed as
follows. Cells were harvested by centrifugation at 17,000 x g and
the media was decanted and saved in a separate container.
The media was concentrated about 8x using the M12 (Amicon,
Beverly MA) filtration system and a 100 kD molecular mass cut-off
filter. The concentrated media was loaded onto an anion exchange
column and the bound proteins were eluted with 1.0 M NaCl. The
1.0 M NaCl elution peak was found to cause mortality against
Southern corn rootworm (SCR) larvae (Table 30). The 1.0 M NaCl
fraction was dialyzed against 10 mM sodium phosphate buffer pH
7.0, concentrated, and subjected to gel filtration on Sepharose
CL-4B (Pharmacia, Piscataway, New Jersey). The region of the
CL-4B elution profile corresponding to the same calculated
molecular weight (about 900 kDa) as the native W-14 toxin complex
was collected, concentrated and bioassayed against larvae. The
collected 900 kDa fraction was found to have insecticidal activity
(see Table 30 below), with symptomology similar to that caused by
native W-14 toxin complex. This fraction was subjected to
Proteinase K and heat treatment; the activity in both cases was
either eliminated or reduced, providing evidence that the activity
is proteinaceous in nature. In addition, the active fraction tested
immunologically positive for the TcbA and TcbAiii peptides in
immunoblot analysis when tested with an anti-TcbAiii monoclonal
antibody (Table 30).



Table 30

Results of Immunoblot and SCR Bioassays.

Columns: Fraction; SCR Activity (% Mortality, % Growth Inhibit.);
Immunoblot Peptides; Native Size [CL-4B Estimated Size]. Only the
entries legible in the source are listed.

TcbA Media 1.0 M Ion Exchange        Immunoblot Peptides: TcbA
TcbA Media CL-4B                     Immunoblot Peptides: TcbA, TcbAiii; Native Size: ~900 kDa
TcbA Media CL-4B + Proteinase K      NT
TcbA Media CL-4B + heat treatment    NT
TcbA Cell Sup CL-4B                  Native Size: ~900 kD

PK = Proteinase K treatment 2 hours; Heat treatment = 100°C for 10
minutes; ND = None Detected; NT = Not Tested. Scoring system for
mortality and growth inhibition as compared to control samples:
5-24% = "+", 25-49% = "++", 50-100% = "+++".



The cell pellet was resuspended in 10 mM sodium phosphate
buffer, pH = 7.0, and lysed by passage through a Bio-Neb™ cell
nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were
treated with DNase to remove DNA and centrifuged at 17,000 x g to
separate the cell pellet from the cell supernatant. The
supernatant fraction was decanted and filtered through a 0.2
micron filter to remove large particles and subjected to anion
exchange chromatography. Bound proteins were eluted with 1.0 M
NaCl, dialyzed and concentrated using Biomax™ (Millipore Corp,
Bedford, MA) concentrators with a molecular mass cut-off of
50,000 Daltons. The concentrated fraction was subjected to gel
filtration chromatography using Sepharose CL-4B beaded matrix.
Bioassay data for material prepared in this way is shown in Table
30 and is denoted as "TcbA Cell Sup".

In yet another method to handle large amounts of material, 
the cell pellets were re-suspended in 10 mM sodium phosphate 
buffer, pH = 7.0 and thoroughly homogenized by using a Kontes 
Glass Company (Vineland, NJ) 40 ml tissue grinder. The cellular
debris was pelleted by centrifugation at 25,000 x g and the cell 
supernatant was decanted, passed through a 0.2 micron filter and 
subjected to anion exchange chromatography using a Pharmacia 
10/10 column packed with Poros HQ 50 beads. The bound proteins 
were eluted by performing a NaCl gradient of 0.0 to 1.0 M. 
Fractions containing the TcbA protein were combined and 
concentrated using a 50 kDa concentrator and subjected to gel 
filtration chromatography using Pharmacia CL-4B beaded matrix. 
The fractions containing TcbA oligomer, molecular mass of 
approximately 900 kDa, were collected and subjected to anion 
exchange chromatography using a Pharmacia Mono Q 10/10 column 
equilibrated with 20 mM Tris buffer pH = 7.3. A gradient of 0.0 
to 1.0 M NaCl was used to elute recombinant TcbA protein. 
Recombinant TcbA eluted from the column at a salt concentration 
of approximately 0.3-0.4 M NaCl, the same molarity at which 
native TcbA oligomer is eluted from the Mono Q 10/10 column. The 
recombinant TcbA fraction was found to cause SCR mortality in 
bioassay experiments similar to those in Table 30. 

(i) APPLICANT: Ensign, Jerald C
               ffrench-Constant, Richard
               Roberts, Jean L

(ii) TITLE OF INVENTION: Insecticidal Protein Toxins From
                         Photorhabdus

(iii) NUMBER OF SEQUENCES: 61

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Quarles & Brady
(B) STREET: 1 South Pinckney Street
(C) CITY: Madison
(D) STATE: WI
(E) COUNTRY: US

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/063,615
(B) FILING DATE: 18-MAY-1993

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/395,497
(B) FILING DATE: 28-FEB-1995

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/007,255
(B) FILING DATE: 06-NOV-1995

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/705,484
(B) FILING DATE: 23-AUG-1996

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Seay, Nicholas J
(B) REGISTRATION NUMBER: 27386
(C) REFERENCE/DOCKET NUMBER: 960296.93804

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 608-251-5000
(B) TELEFAX: 608-251-9166


(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn
1               5                   10

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp
1               5                   10

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala
1               5                   10                  15
Leu Val Ala

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn
1               5                   10

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Ala Gly Asp Thr Ala Asn Ile Gly Asp
1               5

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu
1               5                   10

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Asn Leu Ala Ser Pro Leu Ile Ser
1               5

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys Met Leu
1               5                   10                  15
Arg Gly Val Asn
            20

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..7515

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

ATG CAA AAC TCA TTA TCA AGC ACT ATC GAT ACT ATT TGT CAG AAA CTG      48
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1               5                   10                  15

CAA TTA ACT TGT CCG GCG GAA ATT GCT TTG TAT CCC TTT GAT ACT TT^      96
Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
            20                  25                  30

CGG GAA AAA ACT CGG GGA ATG GTT AAT TGG GGG GAA GCA AAA CGG ATT     144
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg Ile
        35                  40                  45

TAT GAA ATT GCA CAA GCG GAA CAG GAT AGA AAC CTA CTT CAT GAA AAA     192
Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His Glu Lys
    50                  55                  60

CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG     240
Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu
65                  70                  75                  80
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GGT ACC CC3G CAA ATG TTG GGT TTT ATA CAA GOT TAT ACT GAT CTG ttt 2flq 
Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp LeJ 

35 90 95 

OCT AAT CGT GCT OAT ?AC TAT GCC GCG CCG GCC TCG CTT GCA TCG ATC. ■( ^ - 
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser vH Ala Sr M« ' 

TTC TCA CCG GCG GCT TAT TTG ACG GK\ TTG TAC CGT GAA GCC Ai^A AAC 3 ad 
Phe ser Pro Ala Ala l^-r Leu Thr Gla Leu Arg Clu Ala Asn 

X 2 0 



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TTG CAT GAC AGC AGC TCA ATT TAT TAC CTA GAT AAA CGT CGC CCG G'T 
Leu Hxs ASP ser Ser Ser lie TVr T-yr Leu Asp Lys Arg 1% Pro As^p 

^^jj y 

TTA GCA AGC TTA ATG CTC AGC CAG AAA AAT ATG GAT GAG GAA AT^ T'^ JAO 
Leu Ala ser Leu Met Leu Ser Gin Lys Asn Mec Asp Glu gJC ler 

ISO ^.^ 

ilr 71!:, ?r f^^ '■'''^ -^-^ "^^ ^"^ GGG ATC GAA AC A 523 

Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr l"'s 

170 

AC A GGA AAA TCA CAA GAT GAA GTG ATG GAT ATG TTG TCA ACT T^"^ CGT S7fi 
Thr Gly Lys ser Gin Asp Glu Val Met Asp Met Leu SeV ihr Arg 
30 185 190 

TTA AGT GGA GAG ACA CCT TAT CAT CAC GCT TAT GAA ACT GTT rrT rii 

Leu ser Gly Glu Thr Pro lyr His His All Jyr rll vH Ar^ ITu 

200 205 






ill -IT wtl n"^ f'''' "'^'^ '"''^ '^AT TTG TCA CAG GCA CCC 672 

lie val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 

215 220 

?II 17 ^'"'^ ^"^ TTC rrc GGT ATT AGC TCC 720 

lie Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly He Ser S^r 

"0 235 240 

CAT ATT TCG CCA GAA CTC TAT AAC TTC CTC ATT GAG GAG ATC CCG GAA 
HIS lie Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He Pro Glu 

245 250 255 

AAA GAT GAA GCC GCG CTT GAT ACG CTT TAT AAA ACA AAC TTT GGC G^^T 316 

Lys ASP Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phi G?y Asp 



63 



50 265 270 



?n ril 17 ■'^'^ -"^^ ^^"^ TAT TAT 364 

lie Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tvt Tyr 

280 285 

gS 7J{ lit °AT ATI GCC TAC GTC ACG ACT TCA TTA TCA CAT 912 

' .ti ^^'^ Ala Tyr Val Thr Thr Ser Leu Ser His 

295 300 



vIT f ^ ^^"^ ^'^ ATT CCG TTG GTC GAT GGT GTG 960 

val Gly Ti-r Ser Ser Asp He Leu Val He Pro Leu Val Asp Glv Val 

315 * 320 

gTv M^r r^M ^'■''t f T f"" '^''T ""^^ ^""^ -'^^ "^^^ AAT TAT 1003 

L,s Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr 

325 330 335 
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S ser ^ S T^r S l'^ V''' ''^'^ "^"^"^ -^"^^ 

lar ^""^ Asp Asn 

^ I"^! r° Tr ^ "^^"^ -"^"^ ^"^-^ •^<=<^ AGT rrr ggt tto gat ^-at t-tt , , - 

r.r Leu ne Lys ryr Asn Leu 3er Asn Ser ?he Gly ^ ^ 



380 

AAT CCC TAT CCT GAT ATG GTC ATA AAT '"AA AAr tlt ^ 

,j «n P„ „p JiJ JJ- T« CJ= ,,00 



390 39c 

-^^^ 400 



^ Z Sr Aso J^n' t^'^^ 'J' 

fn^ ^ ^^''^ ^^'^ lie Gly Leu Gin 

20 ^ 415 

AGA TGG CAT AGC GGT AGT TAT AAT TTT GCC C'C GCC TTT iii , - 

Arg Trp His Ser Giy Ser TVr Asn Phe Ala Ala AU Asn nl 






^ - - - s s ji: - - - - ill 



440 445 



3.. s I- ^ s: - - - s ™ - s Jii 
3, i z ;iT - - - s 5 5it 

* ° ^■'S 430 

■^^J vTi Sr Arg 51^ Til ^7 f ^ '^^^ ATC AGT GAA 1483 

lyr Arg \al Lys Phe Tyr He Asp Arg ryr Gly lie Ser Glu 

40 450 495 

ctu ?hr f?"^ "^-"^ ATT AAT ATC TCT CAG CAA GCT GTT 153 6 

Glu Thr Ala Ala He Leu Ala Asn He Asn lie Ser Gin Gin Ala vll 



505 510 



t^n ctn f"" f ^t"" ^AA CTA TTT .AAT CAC CCG CCG CTC 1584' 

Gl. Asn Gin Leu Ser Gin Phe Giu Gin Leu Phe Asn His Pro Pro Leu 

520 525 

50 As^ Sy iT, Sg ^r G^J rll ^ ^^T CTT CCT 1632 

^ly lie Arg Tyr Glu lie Ser Glu Asp Asn Ser Lys His Leu Pro 

535 540 

.^n Pro til ^ JJ^ f" l^"" "'^^ ^^"^ ACC GGT GAT GAT CAA CGC 1680 

55 545 ^ " ccn Thr Gly Asp Asp Gin Arg 

555 

J^s Ma SIT r-'' ^AG GTT .AAC GCC AGT GAG TTG TAT 1723 

L.s Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Giu Leu TVr 

56i> 570 575 ' 

ctn m"^? ""^^ GAT CGT AAA GAA CAC GGT GTT ATC AAA AAT 17Tfi. 

Gin Met Leu Leu He Thr Asp Arg Lys Glu Asp Gly vIT ?]; CjJ J^J ' " " 

565 590 

.^-^C TTA GAG AAT TTG TCT GAT CTG TAT TTG GTT AGT TTG CTG crr rar ifl7^ 
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val sTr L^u Ma Oln 



600 605 

ATT CAT AAC CTG ACT ATT GCT GAA TTG AAC ATT TTX; TTG GTG ATT TGT 1372 
lie His Asn Leu Thr lie Ala Giu Leu Asn lie Leu Leu Val lie 
610 615 620 

GGC TAT GGC GAC ACC AAC ATT TAT GAG ATT ACC GAG GAT AAT TTA GC* 1920 
Gly Tyr Gly Asp Thr Asn lie Tyr Gin lie Thr Asp Asp Asn Leu Aia 

630 635 

A.:^:. ATA GTG GAA AC A TTG TTG TGG ATC ACT CAJK TGG TTG >JiO ACC CAA U63 
Lys lie Vai Giu Thr Leu Leu Trp lie Thr Gin Trp Leu Lys Thr Gin 

645 650 655 

AAA TGG ACA GTT ACC GAC CTG TTT CTG ATG ACC ACG GCC ACT TAC AGC 2016
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser
660 665 670

ACC ACT TTA ACC CCA GAA ATT AGC AAT CTG ACG GCT ACG TTG TCT TCA 2064
Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser
675 680 685

ACT TTG CAT GGC AAA GAG AGT CTG ATT GGG CAA GAT CTG AAA AGA GCA 2112
Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
690 695 700

ATG GCG CCT TGC TTC ACT TCG GCT TTG CAT TTG ACT TCT CAA GAA GTT 2160
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
710 715 720

GCG TAT GAC CTG CTG TTG TGG ATA GAC CAG ATT CAA CCG GCA CAA ATA 2208
Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
725 730 735

ACT GTT GAT GGG TTT TGG GAA GAA GTG CAA ACA ACA CCA ACC AGC TTG 2256
Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
740 745 750

AAG GTG ATT ACC TTT GCT CAG GTG CTG GCA CAA TTG AGC CTG ATC TAT 2304
Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
755 760 765

CGT CGT ATT GGG TTA AGT GAA ACG GAA CTG TCA CTG ATC GTG ACT CAA 2352
Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
770 775 780

TCT TCT CTG CTA GTG GCA GGC AAA AGC ATA CTG GAT CAC GGT CTG TTA 2400
Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
790 795 800

ACC CTG ATG GCC TTG GAA GGT TTT CAT ACC TGG GTT AAT GGC TTG GGG 2448
Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815

CAA CAT GCC TCC TTG ATA TTG GCG GCG TTG kPJi GAC GGA GCC TTG ACA 2496
Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
820 825 830

GTT ACC CAT GTA GCA CAA GCT ATG AAT AAG GAG GAA TCT CTC CTA CAA 2544
Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
835 840 845

ATG GCA GCT AAT CAG GTG GAG AAG GAT CTA ACA AAA CTG ACC AGT TGG 2592
Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
850 855 860
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AC A CAG ATT GAC GCT ATT CTG C^J\ TGG TTA CAG ATG TCT TCG GCC TTG 2 64 0 
Thr Gin lie Asp Ala lie Leu Gin Trp Leu Gin Met Ser Ser Ala Leu 
865 370 375 38O 

5 GCG GTT TCT CCA CTG GAT CTG GCA GGG ATG ATG GCC CTG AAA TAT GGG 2683 
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Giy 

885 390 395 

ATA GAT CAT AAC TAT GCT GCC TGG CAA GCT GCG GCG GCT GCG CTG ATG 2*3 6 
10 He Asp His Asn Tyr Ala Ala Trp Gin Ala Ala Ala Ala Ala Leu Met 

300 905 9iO 

GCT GAT CAT GCT PJkT CAG GCA CAG AAA AAA CTG GAT GAG ACG TTC AGT 27 34 
Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 
15 915 920 925 






AAG GCA TTA TGT AAC TAT TAT ATT AAT GCT GTT GTC GAT AGT GCT GCT 2832
Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
930 935 940

GGA GTA CGT GAT CGT AAC GGT TTA TAT ACC TAT VTG CTG ATT GAT AAT 2880
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn
945 950 955 960

CAG GTT TCT GCC GAT GTG ATC ACT TCA CGT ATT GCA GAA GCT ATC GCC 2928
Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala
965 970 975

GGT ATT CAA CTG TAC GTT AAC CGG GCT TTA AAC CGA GAT GAA GGT CAG 2976
Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln
980 985 990

CTT GCA TCG GAC GTT AGT ACC CGT CAG TTC TTC ACT GAC TGG GAA CGT 3024
Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg
995 1000 1005

TAC AAT AAA CGT TAC AGT ACT TGG GCT GGT CTC TCT GAA CTG GTC TAT 3072
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr
1010 1015 1020

TAT CCA GAA AAC TAT GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC AAA 3120
Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln Thr Lys
1025 1030 1035 1040

ATG ATG GAT GCG CTG TTG CAA TCC ATC AAC CAG AGC CAG CTA AAT GCG 3168
Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala
1045 1050 1055

GAT ACG GTG GAA GAT GCT TTC AAA ACT TAT TTG ACC AGC TTT GAG CAG 3216
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln
1060 1065 1070

GTA GCA AAT CTG AAA GTA ATT AGT GCT TAC CAC GAT AAT GTG AAT GTG 3264
Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val
1075 1080 1085

GAT CAA GGA TTA ACT TAT TTT ATC GGT ATC GAC CAA GCA GCT CCG GGT 3312
Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly
1090 1095 1100

ACG TAT TAC TGG CGT AGT GTT GAT CAC AGC AAA TGT GAA AAT GGC AAG 3360
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys
1105 1110 1115 1120

TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC 3408
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val
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65 



.V.T CCT TGG AAA AftT ATC ATC CGT CCG GTT GTT TAT ATG TCC CGC TTA 345.; 
Asn Pro Trp Lys Asn lie lU Ara Pro Val Val Tyr Kec Ser Ara L^u 

H45 1^50 - 

TAT CTG CTA TGG CTG GAG CAG CAA TCA AAG AAA ACT GAT CAT GGT A.a:v 3 504 
Ti. r Leu LGu Trp Leu Glu Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 
1155 1160 1165 

Thr ^h. tV l"'^ ^'^ I:^"^ AAC TTA AAA CTG OCT CAT ATT CGT TAC GAC 3 55 
Thr ihr He T-yr Gin Tyr Asn Leu Lys Leu Ala His He Aro Tyr Asd 
11"0 1175 1180 " 

GGT AGT TGG AAT ACA CCA TTT ACT TTT GAT GT3 ACA GAA AAG GTA AAA 3 60C 
Gly ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val L^ 
^^^5 1190 1195 1200 

l^"^ '^'^^ ^^"^ ^^"^ ^^'^ GAA TCT TTA GGG TTO TAT TCT 3643 

Asn Ti-r Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cvs 

1205 1210 1215 



ACT GOT TAT CAA GGG GAA GAC ACT CTA TTA GTT ATC TTC TAT TCG ATG 3696 
Thr Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 

^220 1225 1230 



CAG AGT AGT. TAT AGC TCC TAT ACC GAT AAT AAT GCG CCG GTC ACT GGG 374.; 
Gin Ser Ser Tyr Ser Ser T/r Thr Asp Asn Asn Ala Pro Val Thr Gly 
1235 1240 1245 

CTA TAT ATT TTC GCT GAT ATG TCA TCA GAC AAT ATG ACG AAT GCA CAA 3792 
Leu Tyr He Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gin 
1250 1255 1260 

^f "^ ti"^ "^^"^ CAA TTT GAT ACT GTG ATG 3340 

Ala Thr Asn Ti'r Trp Asn Asn Ser Tyr Pro Gin Phe Asp Thr Val Met 
^^^^ 1270 1275 1280 

GCA GAT CCG GAT AGC GAC AAT AAA AAA GTC ATA ACC AGA AGA GTT .\AT 3888 
Ala Asp Pro Asp Ser Asp Asn Lys Lys Val He Thr Arg Arg Val Asn 

1285 1290 1295 

AAC CGT TAT GCG GAG GAT TAT GAA ATT CCT TCC TCT GTG ACA AGT AAC 393 5 
Asn Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Thr Ser Asn 

1300 1305 1310 

AGT AAT TAT TCT TGG GGT GAT CAC ACT TTA ACC ATG CTT TAT GGT GGT 3 984 
Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Ti'r Gly Gly 
1315 1320 1325 

AGT GTT CCT AAT ATT ACT TTT GAA TCG GCG GCA GAA GAT TTA AGG CTA 4032 
ser val Pro Asn He Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 
1330 1335 ^3^Q 

55 TCT ACC A^T ATG GCA TTG AGT ATT ATT CAT AAT GGA TAT GCG GCA ACC 4080 

11^=; ^^"^ °ly Tyr Ala Gly Thr 

1350 1355 1360 

CGC CGT ATA CAA TGT AAT CTT ATG AAA CAA TAC CCT TCA TTA GGT GAT 4128 
Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp 

136.5 1370 1375 

AAA TTT ATA ATT TAT GAT TCA TCA TTT GAT GAT GCA AAC CGT TTT AAT 4176 
Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 

1380 1385 1390 
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CTG GTG CCA rrz TTT TTC GCA .AA.^ GAC GAG AAC TCA GAT GAT ACT \Zi\ 

Leu Vai Pro Leu Phe Lys Phe Giy Lys Asp Giu Asn Ser Asp Asp • Ser 
1395 UOO 1405 

5 ATT TGT ATA TAT ,AAT G.AA AAC CCT TCC TCT GAA GAT AAG AAG TGG TAT 42"- 
Ile Cys lie T:,'r Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp T/r 
1410 1415 1420 

TTT TCT TCG AAA GAT GAC A-AT AAA AC A GCG GAT TAT A-AT GGT GGA ACT 432 • 
II* Phe :ier Ser Lys Asp Asp Asn Lys Thr Ala Asp T/v Asn Gly Gly Thr 
142S 1430 1435 1440 

C.AA TGT ATA GAT GCT GGA ACC AGT AAC AAA GAT TTT TAT TAT AAT CTC 4?6 
Gin Cys lie Asp Ala Giy Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 
15 1445 1450 1455 

CAG GAG ATT GAA GTA ATT AGT GTT ACT GGT GGG TAT TGG TCG AGT TAT 441-» 

Gin Glu lie Glu Val lie Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 

1460 1465 1470 



20 



41) 



60 



AAA ATA TCC AAC CCC ATT AAT ATC AAT ACG GGC ATT GAT AGT GCT AAA 4461 
Lys lie Ser Asn Pro lie Asn lie Asn Thr Gly lie Asp Ser Ala Lys 
1475 1480 1485 



23 GTA AAA GTC ACC GTA AAA GCG GGT GGT GAC GAT CAA ATC TTT ACT GCT 451 J 

Val Lys Vai Thr Vai Lys Ala Gly Gly Asp Asp Gin lie Phe Thr Ala 
1490 1495 1500 

GAT AAT AGT ACC TAT GTT CCT CAG CAA CCG GCA CCC AGT TTT GAG GAG 456v 
30 Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Giu Giu 
1505 1510 1515 1520 

ATG ATT TAT CAG TTC AAT AAC CTG AC A ATA GAT TGT AAG AAT TTA AAT 460r^ 
Met lie Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn Leu Asn 
35 1525 1530 1535 

TTC ATC GAC AAT CAG GCA CAT ATT GAG ATT GAT TTC ACC GCT ACG GCA 46r-3 
Phe lie Asp Asn Gin Ala His He Glu lie Asp Phe Thr Ala Thr Ala 

1540 1545 1550 



CAA GAT GGC CGA TTC TTG GGT GCA GAA ACT TTT ATT ATC CCG GTA ACT 47C4 
Gin Asp Gly Arg Phe Leu Gly Ala Giu Thr Phe He lie Pro Vai Thr 
1555 1560 1565 



45 AAA AAA GTT CTC GGT ACT GAG AAC GTG ATT GCG TTA TAT AGC GAA AAT 4752 

Lys Lys Val Leu Gly Thr Glu Asn Val lie Ala Leu Tyr Ser Glu Asn 
1570 1575 1580 

AAC GGT GTT CAA TAT ATG CAA ATT GGC GCA TAT CGT ACC CGT TTG AAT 4300 
50 Asn Gly Val Gin Tyr Met Gin lie Gly Ala Tyr Arg Thr Arg Leu Asn 
1535 1590 1595 1600 

ACG TTA TTC GCT CAA CAG TTG GTT AGC CGT GCT AAT CGT GGC ATT GAT 4343 
Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Giy He Asp 
55 1605 1610 1615 

GCA GTG CTC AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAA TTA GGA 4 896 

Ala Val Leu Ser Met Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly 

1620 1625 1630 



GCG GGC AC A TAT GTG CAG CTT GTG TTG GAT AAA TAT GAT GAG TCT ATT 4944 
Ala Gly Thr Tyr Vai Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser lie 
1635 1640 1645 



65 CAT GGC ACT AAT AAA AGC TTT GCT ATT GAA TAT GTT GAT ATA TTT AAA 4 99 2 
His Giy Thr Asn Lys Ser Phe Ala He Glu Tyr Val Asp He Phe Lys 
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50 



ioSO 1655 1'560 

QkG AAC GAT AGT TTT GTG ATT TAT C.^vA GGA GAA CTT AGC GAA AC A AGT 504C 
Glu Asn Asp Ser Phe Vai lie T/r Gin Gly Giu Leu 3er Glu Thr Ser 
1665' " 1670 1675 1530 

C.AA ACT GTT GTG AAA GTT TTC TTA TCC TAT TTT ATA GAG GCG ACT GGA 5033 
Gin Thr Val Vai Lvs Vai Phe Leu Ser T/r Phe lie Giu Aid Thr Gi/ 

1635 i690 L695 

-^T k^C ,K\C CAC TTA TGG GTA CGT GCT AAA TAG CAA AAG GAA ACG ACT 5135 
Asn Lys Asn His Leu Trp Vai Arg Ala Lys Tyr Gin Lys Glu Thr Thr 

1700 1705 i"10 



15 GAT AAG ATC TTG TTC GAG CGT ACT GAT GAG AAA GAT CCG CAC GGT TGG 5184 
Asp Lys lie Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 
1715 i720 i725 

TTT CTC AGC GAC GAT CAC AAG ACC TTT AGT GGT CTC TCT TCC GCA CAG 5232 
20 Ph« Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 
1730 1735 1740 

GCA TTA A.AG AAC GAC AGT GAA CCG ATG GAT TTC TCT GGC GCC AAT GCT 5230 
::.la Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Aia Asn Ala 
25 1745 1750 1755 1750 

CTC TAT TTC TGG GAA CTG TTC TAT TAC ACG CCG ATG ATG ATG GCT CAT 5323 
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 

1755 1770 1775 



CGT TTG TTG CAG GAA CAG AAT TTT GAT GCG GCG AAC CAT TGG TTC CGT 537 6 

^rg Leu Leu Gin Glu Gin Asn Phe Asp Aia Aia Asn His Trp Phe Arg 

1780 1785 1790 



35 TAT GTC TGG AGT CCA TCC GGT TAT ATC GTT GAT GGT AAA ATT GCT ATC 5424 
Ti'r Vai Trp Ser Pro Ser Gly T-/r lie Val Asp Gly Lys lie Ala lie 
1795 1800 1805 

TAC CAC TGG AAC GTG CGA CCG CTG GAA GAA GAC ACC AGT TGG AAT GCA 5472 
40 Tyr His Trp Asn Val Arg Pro Leu Giu Giu Asp Thr Ser Trp Asn Aia 
1810 1815 1820 

CAA CAA CTC GAC TCC ACC GAT CCA GAT GCT GTA GCC CAA GAT GAT CCG 5520 
Gin Gin Leu Asp Ser Thr Asp Pro Asp Aia Val Ala Gin Asp Asp Pro 
45 1825 1830 1335 1S40 

ATG CAC TAC AAG GTG GCT ACC TTT ATC GCG ACG TTC GAT CTC CTA ATG 5 5 '^3 
Met His Tyr Lys Val Ala Thr Phe Met Aia Thr Leu Asp Leu Leu Met 

1845 1850 1855 



GCC CGT GGT GAT GCT GCT TAC CGC CAG TTA GAG CGT GAT ACG TTG GCT S6i-S 

Ala Arg Gly Asp Aia Aia Tyr Arg Gin Leu Giu Arg Asp Thr Leu Aia 

I860 1865 1870 



55 GAA GCT AAA ATC TCG TAT AC A CAG GCG CTT AAT CTC TTC GGT GAT GAG 56o4 

Glu Aia Lys Met Trp T/r Thr Gin Ala Leu Asn Leu Leu Gly Asp Giu 
1875 1380 1885 

CCA CAA GTC ATG CTC AGT ACG ACT TGG GCT AAT CCA ACA TTC CGT AAT 5712 

60 Pro Gin Vai Met Leu Ser Thr Thr Trp Aia Asn Pro Thr Leu Gly Asn 
1890 1895 1900 

GCT GCT TCA AAA ACC ACA CAG CAG GTT CGT CAG CAA GTC CTT ACC CAG 5760 

Aia Aia Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Vai Leu Thr Gin 
65 1305 1910 1915 1920 
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TT3 "T :TC .^.-.T A3C AGO G7A AA.^^ ACC CCO TTG CTA SGA AC A GC: A.-.T t-iO^ 
Leu Arg Leu Asn Ser Ar? Val Lys Thr Pro Leu Leu Giy Thr Ala Asn 

1325 1930 1935 

5 TCC CTG ACC GCT TTA TTC CTG CCG CAG G;^-^ AAT AGC AAG CTC AAA GGC 5356 
Ser Leu Thr Ala Leu Phe Leu Pro Gin -Jlu Asn Ser Lys Leu Lvs Gly 

1340 1945 1950 

TAG TGG CGG AC A CTG GCG CAG CGT ATG TTT AAT TTA CGT CAT AAT CTG 5904 
10 T-yr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn Leu Arg His Asn Leu 

1955 I960 1965 

TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG TAT GCT AAA CCG GCT 5? 52 
Ser lie Asp Gly Gin Pro Leu ser Leu Pro Leu T-yr AIa Lys Pro Ala 
15 1970 1975 1980 

GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA GCT TCT CAA GGG GGA 6000 
Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 
1585 1990 1995 2000 



20 



40 



GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC CGC TTC CCT CAA ATG 6048 
Ala Asp Leu Pro Lys Ala Pro Leu Thr He His Arg Phe Pro Gin Met 

2005 2010 2015 



25 CTA GAA GGG GCA CGG GGC TTG GTT AAC CAC CTT ATA CAG TTC GGT AGT 6096 
Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu lie Gin Phe Gly ser 

2020 2025 2030 

TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG GAA GCT ATG AGT CAA 6144 
30 Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Mec Ser Gin 

2035 2040 2045 

CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG ACC AGT ATT CGT ATG 6132 

Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He Arg Met 
35 2050 2055 2060 

CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA AAA ACC GCC TTG CAA 6240 
Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 
2065 2070 2075 2080 



GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC AGC TAT AGC CAA CTG 6288 
Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 

2085 2090 2095 



45 TAT GAG GAG AAC ATC AAC GCA GGT GAC CAC CCA GCG CTG GCG TTA CGC 63 36 

Tyr Glu Glu Asn lie Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 

2100 2105 2110 

TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG ATT TCC CGT ATG GCA 6384 
50 Ser Glu Ser Ala He Glu Ser Gin Giy Ala Gin He Ser Arg Met Ala 

2115 2120 2125 

GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC GGC CTG GCT GAT GGC 6432 
Gly Ala Gly Val Asp Met Ala Pro Asn He Phe Gly Leu Ala Asp Gly 
55 2130 2135 2140 

GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC GCT GAC GGT ATT GAG 6480 
cly Met His ryr Gly Ala He Ala T/z Ala He Ala Asp Gly He Glu 
-i45 2150 2155 2160 

60 

TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG AAA GTT GCT CAG TCG 6523 

Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Aia Gin Ser 

2165 2170 2175 

65 GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA ATT CAG CGT GAC AAC 6 57 6 
Glu He T:i'r Arg Arg Arg Arg Gin Glu Trp Lys He Gin Arg Asp Asn 
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2130 2135 2120 

GCA CAA GCG GAG ATT AAC CAG TTA AJ\C GCG C?JK ZTC GAA TCA CTG TCT 65:4 
Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu S«r 
2195 2200 2205 

ATT CGC CGT CAA GCC GCT GAA ATG CAA AAA GAG TAC CTG AA.^ ACC CAG 66" 2 
He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu L>*s Thr Gin 
2210 2215 2220 

C.^ GCT CAG GCG CAG GCA CA.A CTT ACT TTC TTA AGA AGC AAA TTC AGT 6^20 
Gin Ala Gin Ala Gin Ala Gin L^u Thr Phe Leo Arg Ser Lys Phe Ser 
2225 2230 2235 2240 

AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT TTG TCA GGT ATT TAT 6763 
Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly lie Ti'r 

2245 2250 2255 

TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC CTG ATG GCA GAG CAA 6316 
Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin 

2260 2265 2270 

TCC TAT CA.\ TGG GAA GCT AAT GAT AAT TCC ATT AGC TTT GTC AAA CCG 6364 
Ser T/r Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val Lys Pro 
2275 2280 2285 

GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG TGT GGA GAA GCT TTG 6912 
Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 
2290 2295 2300 

ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT CTG AAA TGG GAA TCT 6960 

He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 
2305 2310 2315 2320 

CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG GCA GTG GTT TAT GAT TO 0=5 
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 

2325 2330 2335 

TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG GAA CAA ATA CCT GCA 7 056 
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin He Pro Ala 

2340 2345 2350 

TTA TTG GAT AAG GGG GAG GGA AC A GCA GGA ACT AAA GAA AAT GGG TTA 7i04 
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 
2355 2360 2365 

TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC AAA TTG TCC GAC TTG 7152 
Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 
2370 2375 2380 

AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT GGT AGC AAC AAG GTT 7200 
Lys Leu Gly Thr Asp T^'r Pro Asp Ser He Val Gly Ser Asn Lys Val 
2385 2390 2395 2400 

CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT GCA TTG GTT GGG CCT 724 3 
Arg Arg He Lys Gin He Ser Val Ser Leu Pro Ala Leu Vai Giv Pro 

2405 2410 2415 

TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT GGC AGT ACT CAA TTG "29 6 

T/r Gin Asp Vai Gin Ala Mec Leu Ser T/r Gly Gly Ser Thr Gin Leu 

2420 2425 2430 

CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT GGT ACC AAT GAT AGT 7 344 
Fro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 
2435 2440 2445 
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OCT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA TAG CTG CCA TTT GAA ~3 32 
Gly Gin Phe Gin Leu Asp Phe Asn Asp Giy Lys T/r Leu Pro Phe GIu 
2450 2455 2460 

GGT ATT GCT CTT CAT GAT CAG GGT ACA CTG AAT CTT CAA TTT CCG AAT 7440
Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465 2470 2475 2480

GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT ATG AGC GAT ATT ATT 7488
Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
2485 2490 2495

TTG CAT ATT CGT TAT ACC ATC CGT TAA 7515
Leu His Ile Arg Tyr Thr Ile Arg *
2500 2505



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2505 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10 15

Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
20 25 30

Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg Ile
35 40 45

Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His Glu Lys
50 55 60

Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu
65 70 75 80

Gly Thr Arg Gln Met Leu Gly Phe Ile Gln Gly Tyr Ser Asp Leu Phe
85 90 95

Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met
100 105 110

Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn
115 120 125

Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg Pro Asp
130 135 140

Leu Ala Ser Leu Met Leu Ser Gln Lys Asn Met Asp Glu Glu Ile Ser
145 150 155 160

Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu Thr Lys
165 170 175

Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg
180 185 190
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Leu Ser Oly Glu Thr Pro r/r His His Aia T/r Giu Thr Val Arg Giu 
195 200 205

Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro
210 215 220

Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser
225 230 235 240

His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu
245 250 255

Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp
260 265 270

Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
275 280 285



Gly Val 
20 290 



Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His

295 300 



Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val
305 310 315 320

Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
325 330 335

Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly Asp Asn
340 345 350

Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365

Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His
370 375 380

Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser Gln Ala
385 390 395 400

Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gln
405 410 415



Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile
420 425 430

Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile
435 440 445

Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
450 455 460

Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val Leu Asn
465 470 475 480

Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu
485 490 495

Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln Ala Val
500 505 510

Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro Pro Leu
515 520 525



Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro
530 535 540

Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg
545 550 555 560

Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
565 570 575

Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
580 585 590

Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln
595 600 605

Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val Ile Cys
610 615 620

Gly Tyr Gly Asp Thr Asn Ile Tyr Gln Ile Thr Asp Asp Asn Leu Ala
625 630 635 640

Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gln
645 650 655

Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser
660 665 670

Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser
675 680 685

Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
690 695 700

Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
705 710 715 720

Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
725 730 735

Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
740 745 750

Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
755 760 765

Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
770 775 780

Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
785 790 795 800

Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815

Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
820 825 830

Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
835 840 845

Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
850 855 860

Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865 870 875 880

Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895






Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met
900 905 910

Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser
915 920 925

Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
930 935 940

Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn
945 950 955 960

Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala
965 970 975

Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln
980 985 990

Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg
995 1000 1005

Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr
1010 1015 1020

Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln Thr Lys
1025 1030 1035 1040

Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala
1045 1050 1055

Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln
1060 1065 1070

Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val
1075 1080 1085

Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly
1090 1095 1100

Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys
1105 1110 1115 1120

Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val
1125 1130 1135

Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu
1140 1145 1150

Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys
1155 1160 1165

Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp
1170 1175 1180

Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys
1185 1190 1195 1200

Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys
1205 1210 1215

Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met
1220 1225 1230



Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly
1235 1240 1245

Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln
1250 1255 1260

Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met
1265 1270 1275 1280

Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn
1285 1290 1295

Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn
1300 1305 1310

Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly
1315 1320 1325

Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu
1330 1335 1340

Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr
1345 1350 1355 1360

Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu Gly Asp
1365 1370 1375

Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn
1380 1385 1390

Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser
1395 1400 1405

Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr
1410 1415 1420

Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr
1425 1430 1435 1440

Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu
1445 1450 1455

Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr
1460 1465 1470

Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1475 1480 1485

Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala
1490 1495 1500

Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu
1505 1510 1515 1520

Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn
1525 1530 1535

Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala
1540 1545 1550

Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr
1555 1560 1565

Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn
1570 1575 1580



Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn
1585 1590 1595 1600

Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp
1605 1610 1615

Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1620 1625 1630

Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile
1635 1640 1645

His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys
1650 1655 1660

Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser
1665 1670 1675 1680

Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly
1685 1690 1695

Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr
1700 1705 1710

Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp
1715 1720 1725

Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gln
1730 1735 1740

Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala
1745 1750 1755 1760

Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His
1765 1770 1775

Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp Phe Arg
1780 1785 1790

Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile
1795 1800 1805

Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala
1810 1815 1820

Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro
1825 1830 1835 1840

Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met
1845 1850 1855

Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala
1860 1865 1870

Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu
1875 1880 1885

Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn
1890 1895 1900

Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln
1905 1910 1915 1920

Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn
1925 1930 1935

Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu Lys Gly
1940 1945 1950

Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu
1955 1960 1965

Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala
1970 1975 1980

Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly
1985 1990 1995 2000

Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met
2005 2010 2015

Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser
2020 2025 2030

Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln
2035 2040 2045

Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met
2050 2055 2060

Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln
2065 2070 2075 2080

Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser Gln Leu
2085 2090 2095

Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg
2100 2105 2110

Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala
2115 2120 2125

Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly
2130 2135 2140

Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu
2145 2150 2155 2160

Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser
2165 2170 2175

Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn
2180 2185 2190

Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser
2195 2200 2205

Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr Gln
2210 2215 2220

Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys Phe Ser
2225 2230 2235 2240

Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr
2245 2250 2255

Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gln
2260 2265 2270

Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro
2275 2280 2285

Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu
2290 2295 2300



Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser
2305 2310 2315 2320

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp
2325 2330 2335

Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala
2340 2345 2350

Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu
2355 2360 2365

Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu
2370 2375 2380

Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val
2385 2390 2395 2400

Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro
2405 2410 2415

Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu
2420 2425 2430

Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
2435 2440 2445

Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
2450 2455 2460

Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465 2470 2475 2480

Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
2485 2490 2495

Leu His Ile Arg Tyr Thr Ile Arg *



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Xaa Ala
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






WO 97/17432



PCT/US96/18003 






(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gln Asn Ser Leu
1 5 



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ala Phe Asn Ile Asp Asp Val Ser Leu Phe
1 5 10



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser
1 5 10 15

Leu Gln Leu Phe Ile

20



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gly Ile Asp Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro
1 5 10 15

Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu

20 25



(2) INFORMATION FOR SEQ ID NO: 22: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly
1 5 10 15

Val Gln Tyr Met Gln Ile

20



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6005 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE: 

(A) NAME/KEY: RBS 

(B) LOCATION: 1..9 






(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 16..3585

(D) OTHER INFORMATION: /product= "P8"
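As a quick consistency check on a CDS feature such as the one above, the location span can be converted into a codon count. The following is a minimal sketch only: the helper name `cds_codon_count` is hypothetical, and it assumes a simple inclusive `start..end` location with no joins or partial-end markers.

```python
def cds_codon_count(location: str) -> int:
    """Return the number of whole codons in an inclusive 'start..end' span."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    length = end - start + 1  # both endpoints are part of the feature
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# CDS location as given for SEQ ID NO: 25
print(cds_codon_count("16..3585"))  # 1190
```

For the span 16..3585 this gives 3570 nucleotides, that is 1190 codons, which agrees with the length stated for SEQ ID NO: 26 if the terminal stop position (shown as `*` in the listing) is counted.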



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 



AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA 51
Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
1 5 10

GCG CGC CGT GAT GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC 99
Ala Arg Arg Asp Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro
15 20 25

GCA GAT TTA AAA GAG ACT ATC CAG ACC GCG GAT GAT CTG TAG GAA TAT 147
Ala Asp Leu Lys Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr
30 35 40

CTG TTG CTG GAT ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG 195
Leu Leu Leu Asp Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu
45 50 55 60

TCC CAA GCG ATT GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG 243
Ser Glu Ala Ile Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu
65 70 75

GGC TAT GAC GGC ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT 291
Gly Tyr Asp Gly Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp
80 85 90

GAA CAG TTT TTA TAT AAC TGG CAT AGT TTT AAC CAC CGT TAT AGC ACT 339
Glu Gln Phe Leu Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr
95 100 105

TGG GCT GGC AAC GAA CCG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT 387
Trp Ala Gly Lys Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp
110 115 120

CCA ACA TTC CCA TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA 435
Pro Thr Leu Arg Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln
125 130 135



GGT ATT TCT CAA GGG AAA TTA AAA ACT GAA TTA GTC GAA TCT AAA TTA 483
Gly Ile Ser Gln Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu
145 150 155

CGT GAT TAT CTA ATT ACT TAT CAC ACT TTA CCC ACC CTT GAT TAT ATT 531
Arg Asp Tyr Leu Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile
160 165 170

ACT GCC TCC CAA GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT 579
Thr Ala Cys Gln Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg
175 180 185

ACA CAC AAT CCA CCC TAT CCA TTT TAT TGG CGA AAA TTA ACT TTA GTC 627
Thr Gln Asn Ala Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val
190 195 200

ACT GAT GGC CGT AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA CCA 675
Thr Asp Gly Gly Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala
205 210 215 220

ATT AAT GCC GGG ATT ACT GAG CCA TAT TCA GGG CAT GTC GAG CCT TTC 723
Ile Asn Ala Gly Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe
225 230 235

TGG GAA AAT AAC AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA 771
Trp Glu Asn Asn Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu
240 245 250

GAT AAA ATA GAT TTT CTT TAT AAA AAC ATC TGG CTG ATG ACT AGC GAT 819
Asp Lys Ile Asp Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp
255 260 265

TAT ACC TGG CCA TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT CAC 867
Tyr Ser Trp Ala Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp
270 275 280

TAC AAT ACA GTT GGA GCA ACA GGA TCA TCA AGC CCC ACT GAA CTA GCT 915
Tyr Asn Arg Val Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala
285 290 295 300

TCA CAA TAT GGT TCT CAT GCT CAG ATG AAT ATT TCT GAT CAT GGG ACT 963
Ser Gln Tyr Gly Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr
305 310 315

CTA CTT ATT TTT CAG AAT CCC GGC GGA CCT ACT CCC ACT ACT GGA CTG 1011
Val Leu Ile Phe Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val
320 325 330

ACC TTA TGT TAT CAC TCT GGC AAC CTG ATT AAG AAC CTA TCT ACT ACA 1059
Thr Leu Cys Tyr Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr
335 340 345

CGA ACT CCA AAT TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC 1107
Gly Ser Ala Asn Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg
350 355 360

ATC TGT CAT GGA CAA ACT TAC AAT CAT AAT AAC TAC TCC AAT TTT ACA 1155
Met Cys His Gly Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr
365 370 375 380

CTC TCT ATT AAT ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA 1203
Leu Ser Ile Asn Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser
385 390 395

GAT GGA AAA CAA TTT ACA CCA CCT TCT CGT TCT GCC ATT CAT TTA CAC 1251
Asp Gly Lys Gln Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His
400 405 410



CTC CCT AAT TAT CTA CAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT 1299
Leu Pro Asn Tyr Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp
415 420 425

TCA CTA CTT AAT TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG 1347
Ser Leu Leu Asn Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro
430 435 440

GTT GAT AAT TTC ACT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC 1395
Val Asp Asn Phe Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe
445 450 455 460

TTC CAT ATT CCG TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT 1443
Phe His Ile Pro Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg
465 470 475

TAC GAA GAC GCG GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT 1491
Tyr Glu Asp Ala Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly
480 485 490

TAT CGC GAT GCT AAT CGC CAG CTC ATT ATG GAT GGC ACT AAA CCA CGT 1539
Tyr Arg Asp Ala Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg
495 500 505

TAT TGG AAT GTG ATG CCA TTG CAA CTG GAT ACC CCA TGG GAT ACC ACA 1587
Tyr Trp Asn Val Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr
510 515 520

CAG CCC GCC ACC ACT CAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG 1635
Gln Pro Ala Thr Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met
525 530 535 540

CAT TAC AAG CTC GCC ATA TTC CTC CAT ACC CTT CAT CTA TTG ATT GCC 1683
His Tyr Lys Leu Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala
545 550 555

CCA CGC CAC ACC CCT TAC CCT CAA CTT CAA CCC GAT ACT CTA GTC CAA 1731
Arg Gly Asp Ser Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu
560 565 570

GCC AAA ATC TAC TAC ATT CAG CCA CAA CAG CTA CTG GGA CCG CGC CCT 1779
Ala Lys Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro
575 580 585

GAT ATC CAT ACC ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA 1827
Asp Ile His Thr Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu
590 595 600

GCT GGC GCT ATT GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG CTG ATG 1875
Ala Gly Ala Ile Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met
605 610 615 620

ACG TTC GCT GCC TCG CTA ACC CCA GGC CAT ACC CCA AAT ATT CGC GAC 1923
Thr Phe Ala Ala Trp Leu Ser Ala Gly Asp Thr Ala Asn Ile Gly Asp
625 630 635

GGT GAT TTC TTG CCA CCC TAC AAC CAT CTA CTA CTC GCT TAC TCG CAT 1971
Gly Asp Phe Leu Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp
640 645 650



rA^ CTT GAG TTA CGC CTA TAC AAC CTG CGC CAC AAT CTG AGT CTG GAT 2019 
Lys Leu Glu Leu Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp 
655 660 665 
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ZCT ZK^ CCG CTA AAT CTG CCA CTC TAT GCC ACC CCG CTA GAC CCC AAA I'li' 
Gly Gln Pro Leu Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys
670 675 680

ACC CTG CAA CGC CAG CAA GCC CGA GGG GAC GGT ACA GCC AGT ACT CCG 2115
Thr Leu Gln Arg Gln Gln Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro
685 690 695 700

GCT GGT GGT CAA CGC AGT GTT CAG CGC TCG CGC TAT CCG TTA TTG CTA 2163
Ala Gly Gly Gln Gly Ser Val Gln Gly Trp Arg Tyr Pro Leu Leu Val
705 710 715

GAA CGC GCC CGC TCT GCC CTG ACT TTG TTG ACT CAG TTC CGC AAC AGC 2211
Glu Arg Ala Arg Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser
720 725 730

TTA CAA ACA ACC TTA GAA CAT CAG GAT AAT GAA AAA ATG ACC ATA CTG 2259
Leu Gln Thr Thr Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu
735 740 745

TTG CAG ACT CAA CAG GAA GCC ATC CTG AAA CAT CAG CAC GAT ATA CAA 2307
Leu Gln Thr Gln Gln Glu Ala Ile Leu Lys His Gln His Asp Ile Gln
750 755 760

CAA AAT AAT CTA AAA GGA TTA CAA CAC AGC CTG ACC CCA TTA CAG GCT 2355
Gln Asn Asn Leu Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala
765 770 775 780

AGC CGT CAT CGC GAC ACA TTG CCG CAA AAA CAT TAC AGC GAC CTG ATT 2403
Ser Arg Asp Gly Asp Thr Leu Arg Gln Lys His Tyr Ser Asp Leu Ile
785 790 795

AAC GGT GGT CTA TCT CCG CCA GAA ATC GCC GGT CTG ACA CTA CGC AGC 2451
Asn Gly Gly Leu Ser Ala Ala Glu Ile Ala Gly Leu Thr Leu Arg Ser
800 805 810

ACC GCC ATG ATT ACC AAT CGC GTT CCA ACC CGA TTG CTG ATT GCC CGC 2499
Thr Ala Met Ile Thr Asn Gly Val Ala Thr Gly Leu Leu Ile Ala Gly
815 820 825

GGA ATC GCC AAC CCG GTA CCT AAC CTC TTC GGG CTC GCT AAC GGT GGA 2547
Gly Ile Ala Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly
830 835 840

TCG GAA TGG GGA CCG CCA TTA ATT GCC TCC CGC CAA CCA ACC CAA GTT 2595
Ser Glu Trp Gly Ala Pro Leu Ile Gly Ser Gly Gln Ala Thr Gln Val
845 850 855 860

GGC GCC GCC ATC CAC CAT CAG AGC CCG GCC ATT TCA CAA CTG ACA CCA 2643
Gly Ala Gly Ile Gln Asp Gln Ser Ala Gly Ile Ser Glu Val Thr Ala
865 870 875

GGC TAT CAG CGT CGT CAG GAA GAA TGG CCA TTG CAA CGC GAT ATT GCT 2691
Gly Tyr Gln Arg Arg Gln Glu Glu Trp Ala Leu Gln Arg Asp Ile Ala
880 885 890

GAT AAC GAA ATA ACC CAA CTG GAT GCC CAG ATA CAA AGC CTG CAA GAG 2739
Asp Asn Glu Ile Thr Gln Leu Asp Ala Gln Ile Gln Ser Leu Gln Glu
895 900 905

CAA ATC ACC ATG CCA CAA AAA CAC ATC ACC CTC TCT GAA ACC GAA CAA 2787
Gln Ile Thr Met Ala Gln Lys Gln Ile Thr Leu Ser Glu Thr Glu Gln
910 915 920

CCG AAT GCC CAA GCC ATT TAT CAC CTG CAA ACC ACT CCT TTT ACC GCC 2835
Ala Asn Ala Gln Ala Ile Tyr Asp Leu Gln Thr Thr Arg Phe Thr Gly
925 930 935 940

CAG CCA CTC TAT AAC TGG ATG GCC GGT CGT CTC TCC CCG CTC TAT TAC 2883
Gln Ala Leu Tyr Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr
945 950 955

CAA ATG TAT GAT TCC ACT CTG CCA ATC TGT CTC CAG CCA AAA GCC GCA 2931
Gln Met Tyr Asp Ser Thr Leu Pro Ile Cys Leu Gln Pro Lys Ala Ala
960 965 970

TTA GTA CAG GAA TTA GGC GAG AAA GAG AGC GAC AGT CTT TTC CAG GTT 2979
Leu Val Gln Glu Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gln Val
975 980 985

CCG GTG TGG AAT GAT CTG TGG CAA GGG CTG TTA GCA GGA GAA GGT TTA 3027
Pro Val Trp Asn Asp Leu Trp Gln Gly Leu Leu Ala Gly Glu Gly Leu
990 995 1000

AGT TCA GAG CTA CAG AAA CTG GAT GCC ATC TGG CTT GCA CGT GGT GGT 3075
Ser Ser Glu Leu Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly
1005 1010 1015 1020

ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3123
Ile Gly Leu Glu Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly
1025 1030 1035

ACA GGG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC GGG GAA ACG 3171
Thr Gly Thr Leu Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr
1040 1045 1050

GTA TCT CCA TCC GGT GGC GTC ACT CTG GCG CTG ACA GGG GAT ATC TTC 3219
Val Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe
1055 1060 1065

CAA GCA ACA CTG GAT TTG AGT CAG CTA GGT TTG GAT AAC TCT TAC AAC 3267
Gln Ala Thr Leu Asp Leu Ser Gln Leu Gly Leu Asp Asn Ser Tyr Asn
1070 1075 1080

TTG GGT AAC GAG AAG AAA CGT CGT ATT AAA CGT ATC GCC GTC ACC CTG 3315
Leu Gly Asn Glu Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu
1085 1090 1095 1100

CCA ACA CTT CTG GGG CCA TAT CAA GAT CTT GAA GCC ACA CTG GTA ATG 3363
Pro Thr Leu Leu Gly Pro Tyr Gln Asp Leu Glu Ala Thr Leu Val Met
1105 1110 1115

GGT GCC GAA ATC GCC GCC TTA TCA CAC GGT GTG AAT GAC GGA GGC CGG 3411
Gly Ala Glu Ile Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg
1120 1125 1130

TTT GTT ACC GAC TTT AAC GAC AGC CGT TTT CTG CCT TTT GAA GGT CGA 3459
Phe Val Thr Asp Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg
1135 1140 1145

GAT GCA ACA ACC CGC ACA CTG GAG CTC AAT ATT TTC CAT GCG GGT AAA 3507
Asp Ala Thr Thr Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys
1150 1155 1160

GAG GGA ACG CAA CAC GAG TTG GTC GCG AAT CTG AGT GAC ATC ATT GTG 3555
Glu Gly Thr Gln His Glu Leu Val Ala Asn Leu Ser Asp Ile Ile Val
1165 1170 1175 1180

CAT CTG AAT TAC ATC ATT CGA GAC GCG TAA ATTTCTTTTC TTTGTCGATT 3605
His Leu Asn Tyr Ile Ile Arg Asp Ala *
1185 1190



ACAGGTCCCT ATCAGGGCCC TGTTATTAAG CACTACTTTA TSCAGGATTC ACCACAACTA 3665
TCCATTACAA CCCTCTCACT TCCCAAACCT CCCGCTCCTA TCAATCCCAT GGGAGAAGCA 3725
CTGAATGCTC CCGCCCCTGA TCGAATCGCC TCCCTATCTC TGCCATTACC CCTTTCGACC 3785
GGCAGAGCGA CGGCTCCTCG ATTATCGCTG ATTTACAGCA ACAGTGCAGG TAATGGGCCT 3845
TTCGGCATCG GCTGCCAATG CGCTGTTATG TCCATTAGCC GACGCACCCA ACATCCCATT 3905 
CCACAATACG GTAATGACGA CACGTTCCTA TCCCCACAAG GCGAGGTCAT GAATATCCCC 3965
CTGAATGACC AAGGGCAACC TGATATCCGT CAAGACGTTA AAACGCTCCA AGGCCTTACC 4025 
TTGCCAATTT CCTATACCGT CACCCGCTAT CAAGCCCCCC AGATCCTGGA TTTCAGTAAA 4085 
ATCGAATACT GGCAACCTGC CTCCGCTCAA GAAGGACGCG CTTTCTGGCT GATATCCACA 4145 
CCGGACGGGC ATCTACACAT CTTAGGGAAA ACCGCGCACG CTTGTCTCGC AAATCCGCAA 4205 
AATGACCAAC AAATCGCCCA GTCGTTGCTG GAAGAAACTG TGACCCCACC CGCTGAACAT 4265 
GTCAGCTATC AATATCGAGC CGAAGATGAA GCCCATTGTG ACGACAATGA AAAAACCGCT 4325 
CATCCCAATG TTACCGCACA GCCCTATCTG GTACAGGTGA ACTACAGGCA ACATCAAACC 4385
ACAAGCCAGC CTGTTCGTAC TGGATAACGC ACCTCCCGCA CCGGAAGAGT GGCTCTTTCA 4445 
TCTGGTCTTT GACCACGGTG AGCGCGTACC TCACTTCATA CCGTCCCAAC ATGGGATGCA 4505 
GGTACAGCGC AATGGTCTGT ACGCCCGCAT ATCTTCTCTC GCTATCAATA TGGTTTTCAA 4565 
GTGCGTACTC GCCGCTTATC TCAACAAGTC CTCATGTTTC ACCGCACCGC GCTCATGGCC 4625 
GGAGAAGCCA GTACCAATGA CGCCCCGGAA CTGGTTCGAC GCTTAATACT GGAATATGAC 4685 
AAAAACGCCA GCGTCACCAC GTTCATTACC ATCCCTCAAT TAAGCCATCA ATCGGACGGG 4745 
AGGCCAGTCA CCCAGCCACC ACTAGAACTA GCCTGGCAAC GGTTTGATCT GGAGAAAATC 4805 
CCGACATCCC AACGCTTTGA CGCACTAGAT AATTTTAACT CGCAGCAACG TTATCAACTG 4865
GTTGATCTGC GGGGAGAAGG GTTGCCAGGT ATCCTGTATC AAGATCGAGC CGCTTGGTGG 4925 
TATAAAGCTC CGCAACGTCA GGAAGACGGA GACAGCAATG CCCTCACTTA CGACAAAATC 4985 
GCCCCACTGC CTACCCTACC CAATTTGCAG GATAATCCCT CATTGATCCA TATCAACGGA 5045 
GACGGCCAAC TCGATTCGGT TGTTACCGCC TCCCCTATTC GCGGATACCA TAGTCAGCAA 5105 
CCCGATGGAA AGTGGACGCA CTTTACCCCA ATCAATGCCT TGCCCGTGGA ATATTTTCAT 5165 
CCAAGCATCC AGTTCGCTGA CCTTACCGGG GCAGGCTTAT CTGATTTAGT GTTCATCGGG 5225 
CCGAAAAGCG TGCGTCTATA TGCCAACCAG CGAAACGGCT GCCGTAAAGG AGAAGATGTC 5285
CCCCAATCCA CAGCTATCAC CCTGCCTCTC ACAGGCACCG ATCCCCCCAA ACTGGTGGCT 5345 
TTCAGTGATA TGCTCGCTTC CGGTCAACAA CATCTGGTGG AAATCAAGGG TAATCGCGTC 5405 
ACCTGTTGGC CGAATCTAGG GCATGCCCGT TTCGGTCAAC CACTAACTCT GTCAGGATTT 5465 
AGCCAGCCCC AAAATAGCTT CAATCCCGAA CGCCTCTTTC TGGCGGATAT CGACGGCTCC 5525 
GGCACCACCC ACCTTATCTA TCCCCAATCC GGCTCTTTGC TCATTTATCT CAACCAAAGT 5585
JCT.^ATCAGT TTGATGCCCC GTTGACATTA GCGTTCCCAC AAGGCGTACA ATTTGACAiC 5645
ACTTCCCAAC TTCAAGTCGC CGATATTCAG GGATTACGGA TAGCCACCTT GATTCTGACT 5705
GTGCCACATA TCCCGCCACA TCACTCGCGT TCTGACCTGT CACTGACCAA ACCCTGGTTG 5765
TTGAATGTAA TCAACAATAA CCGGGCCGCA CATCACACGC TACATTATCG TACTTCCGCG 5825
CAATTCTCGT TCGATGAAAA ATTACACCTC ACCAAAGCAG CCAAATCTCC GGCTTGTTAT 5885
CTGCCGTTTC CAATGCATTT GCTATGGTAT ACCGAAATTC AGGATGAAAT CAGCGGCAAC 5945
CGGCTCACCA GTCAAGTCAA CTACAGCCAC GGCGTCTGGG ATGGTAAAGA GCGGGAATTC 6005
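The paired nucleotide and amino acid rows in SEQ ID NO: 25 can be spot-checked by translating a row of codons and comparing the result with the residues printed beneath it. The sketch below is illustrative only: the table `CODON_TO_RESIDUE` and the function `translate` are hypothetical names, and the table is deliberately partial, holding just the standard-genetic-code entries needed for the first coding row (nucleotides 16 to 51).

```python
# Partial codon table: only the codons that occur in the example row
# (standard genetic code, three-letter residue names).
CODON_TO_RESIDUE = {
    "ATG": "Met", "TCT": "Ser", "GAA": "Glu", "TTA": "Leu", "TTT": "Phe",
    "ACA": "Thr", "CAA": "Gln", "ACG": "Thr", "TTG": "Leu", "AAA": "Lys",
}


def translate(dna: str) -> list[str]:
    """Translate space-separated codons into three-letter residue names."""
    return [CODON_TO_RESIDUE[codon] for codon in dna.split()]


# First coding row of SEQ ID NO: 25, without the 15-nt leader
first_row = "ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA"
print(" ".join(translate(first_row)))
# Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
```

A codon missing from the partial table raises `KeyError`, which is a convenient signal that a row needs a fuller table or contains a character that is not a valid nucleotide.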






(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1190 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp
35 40 45

Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu
85 90 95

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys
100 105 110

Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg
115 120 125

Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln
130 135 140

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu
145 150 155 160

Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln
165 170 175

Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala
180 185 190



Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly
195 200 205

Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly
210 215 220

Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn
225 230 235 240

Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp
245 250 255

Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala
260 265 270

Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val
275 280 285

Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly
290 295 300

Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe
305 310 315 320

Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335

Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn
340 345 350

Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365

Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr
405 410 415

Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn
420 425 430

Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe
435 440 445

Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg Tyr Glu Asp Ala
465 470 475 480

Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly Tyr Arg Asp Ala
485 490 495

Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val
500 505 510

Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr Gln Pro Ala Thr
515 520 525

Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met His Tyr Lys Leu
530 535 540



Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ser
545 550 555 560

Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr
565 570 575

Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr
580 585 590

Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala Ile
595 600 605

Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala
610 615 620

Trp Leu Ser Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu
625 630 635 640

Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu
645 650 655

Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu
660 665 670

Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg
675 680 685

Gln Gln Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln
690 695 700

Gly Ser Val Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg
705 710 715 720

Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr
725 730 735

Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln
740 745 750

Gln Glu Ala Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu
755 760 765

Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly
770 775 780

Asp Thr Leu Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu
785 790 795 800

Ser Ala Ala Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile
805 810 815

Thr Asn Gly Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn
820 825 830

Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly
835 840 845

Ala Pro Leu Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile
850 855 860

Gln Asp Gln Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg
865 870 875 880

Arg Gln Glu Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile
885 890 895

Thr Gln Leu Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met
900 905 910

Ala Gln Lys Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln
915 920 925

Ala Ile Tyr Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr
930 935 940

Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp
945 950 955 960

Ser Thr Leu Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu
965 970 975

Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn
980 985 990

Asp Leu Trp Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu
995 1000 1005

Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu
1010 1015 1020

Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu
1025 1030 1035 1040

Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser
1045 1050 1055

Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu
1060 1065 1070

Asp Leu Ser Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu
1075 1080 1085

Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu
1090 1095 1100

Gly Pro Tyr Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile
1105 1110 1115 1120

Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp
1125 1130 1135

Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr
1140 1145 1150

Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln
1155 1160 1165

His Glu Leu Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr
1170 1175 1180

Ile Ile Arg Asp Ala *
1185 1190



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1881 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1881

(D) OTHER INFORMATION: /product= "P8"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

ATG TCT CAA TCT TTA TTT ACA CAA ACG TTG AAA GAA GCG CGC CGT GAT 48
Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

GCA TTG GTT CCT CAT TAT ATT GCT ACT CAG GTG CCC CCA GAT TTA AAA 96
Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT CTG TTC CTC GAT 144
Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp
35 40 45

Thr ^ r7 ^ ^^"^ ^"^ °^ ^^"^ CTC TCC GAA GCG ATT 192 

Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

^5 S f"^ TTT ATT CAT CGT GCG ATA GAG GGC TAT CAC GGC 240 

Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

Th^ ^^'^ TAT TTT GCC CAT CAA CAC TTT TTA 283 

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu

85 90 95 

^ A^I IT "^^"^ "^^"^ °CT CGC AAG 335 

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 

105 110 






G?J *^ r° ^ T TAT ATT GAT CCA ACA TTC CGA 334 

Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg
115 120 125

TTC AAT AAC ACC GAG ATA TTT ACC GCA TTT GAA CAA GGT ATT TCT CAA 432 

Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln

130 135 140

GCG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA CGT GAT TAT CTA 480
Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu

145 150 155 160

iT. ser JJr A«n '^'^^ ^" ^^"^ "^^T ACT GCC TCC CAA 523 

Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln

165 170 175

S?v ^« f^I ^^ ^'^'^ ATC rrc TTT ATT GGC CGT ACA CAG AAT CCA 57 6 

Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala

180 185 190 

Pro ?Jr Ti^ IT I" '^^^ ^ '"^ TTA GTC ACT GAT GCC GGT 624 

Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly
195 200 205

IS JC^ Pr^ f'''^ "^^^ ^'^'^ ^'^^ *TT AAT GCC GCG 6" 2 

Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly
210 215 220



ATT ACT GAG GCA TAT TCA GGC CAT GTC GAG CCT TTC TGG GAA AAT AAC 720
Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn
225 230 235 240

AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA GAT AAA ATA GAT 768
Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp
245 250 255

TTT GTT TAT AAA AAC ATC TGG GTC ATC ACT AGC GAT TAT AGC TGG GCA 816
Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala
260 265 270

TCA AAG AAA AAA ATC TTC GAA CTT TCT TTT ACT GAC TAC AAT AGA GTT 864
Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val
275 280 285

GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT TCA CAA TAT GGT 912
Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly
290 295 300

TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT GTA CTT ATT TTT 960
Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe
305 310 315 320

CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG ACG TTA TGT TAT 1008
Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335

GAC TCT GCC AAC GTG ATT AAG AAC CTA TCT AGT ACA GGA AGT GCA AAT 1056
Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn
340 345 350

TTA TCG TCA AAC GAT TAT GCC ACA ACT AAA TTA CGC ATG TCT CAT GGA 1104
Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365

CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA CTC TCT ATT AAT 1152
Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA GAT GGA AAA CAA 1200
Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC CTC CCT AAT TAT 1248
Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr
405 410 415

GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT TCA CTA CTT AAT 1296
Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn
420 425 430

TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG GTT GAT AAT TTC 1344
Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe
435 440 445

AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC TTC CAT ATT CCG 1392
Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT TAC GAA GAC GCG 1440
Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg Tyr Glu Asp Ala
465 470 475 480

GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT TAT CGC GAT GCT 1488
Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly Tyr Arg Asp Ala
485 490 495

AAT GGC CAG CTC ATT ATG GAT GGC ACT AAA CCA CGT TAT TGG AAT CTG 1536
Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val
500 505 510

ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA CAG CCC GCC ACC 1584
Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr Gln Pro Ala Thr
515 520 525

ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG CAT TAC AAG CTG 1632
Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met His Tyr Lys Leu
530 535 540

GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC CGA GGC GAC AGC 1680
Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ser
545 550 555 560

GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA GCC AAA ATG TAC 1728
Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr
565 570 575

TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT GAT ATC CAT ACC 1776
Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr
580 585 590

ACC AAT ACT TGG CCA AAT CCC ACC TTG ACT AAA GAA GCT GGC GCT ATT 1824
Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala Ile
595 600 605

GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG ACG TTC GCT GCC 1872
Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala
610 615 620

TGG CTA AGC 1881
Trp Leu Ser
625



(2) INFORMATION FOR SEQ ID NO:28: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 627 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp
35 40 45

Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu
85 90 95

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys
100 105 110

Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg
115 120 125

Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln
130 135 140

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu
145 150 155 160

Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln
165 170 175

Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala
180 185 190

Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly
195 200 205

Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly
210 215 220

Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn
225 230 235 240

Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp
245 250 255

Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala
260 265 270

Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val
275 280 285

Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly
290 295 300

Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe
305 310 315 320

Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335

Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn
340 345 350

Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365

Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr
405 410 415

Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn
420 425 430

Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe
435 440 445

Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg Tyr Glu Asp Ala
465 470 475 480

Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly Tyr Arg Asp Ala
485 490 495

Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val
500 505 510

Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr Gln Pro Ala Thr
515 520 525

Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met His Tyr Lys Leu
530 535 540

Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ser
545 550 555 560

Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr
565 570 575

Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr
580 585 590

Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala Ile
595 600 605

Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala
610 615 620

Trp Leu Ser
625



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1689 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1689

(D) OTHER INFORMATION: /product= "SB"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GCA GGC GAT ACC GCA AAT ATT CGC GAG GGT CAT TTC TTG CCA CCG TAC 48

Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10 15

A.i.C 3 AT GTA CTA CTC ZCT TAC TZO GAT CTT GAG TTA CGC CTA T^'* M 

Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr
20 25 30

AAC CTC CGC CAC AAT CTC ACT CTC GAT GGT CAA CCG CTA AAT CTC 144
Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40 45

CTG TAT GCC ACG CCG GTA GAC CCG AAA ACC CTG CAA CGC CAG CAA GCC 192
Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala
50 55 60

GGA CGG CAC GGT ACA GCC ACT ACT CCG GCT GGT GGT CAA GCC ACT GTT 240
Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val
65 70 75 80

CAG GGC TGG CGC TAT CCG TTA TTG GTA CAA CGC GCC CGC TCT GCC GTG 288
Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val
85 90 95

ACT TTG TTG ACT CAG TTC GGC AAC ACC TTA CAA ACA ACG TTA CAA CAT 336
Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His
100 105 110

CAG GAT AAT GAA AAA ATG ACG ATA CTG TTG CAG ACT CAA CAG GAA GCC 384
Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln Gln Glu Ala
115 120 125

ATC CTG AAA CAT CAG CAC GAT ATA CAA CAA AAT AAT CTA AAA GGA TTA 432
Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu Lys Gly Leu
130 135 140

CAA CAC ACC CTG ACC CCA TTA CAG GCT ACC CGT GAT GGC GAC ACA TTG 480
Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly Asp Thr Leu
145 150 155 160

CGG CAA AAA CAT TAC AGC GAC CTG ATT AAC GGT GGT CTA TCT GCC GC^ 528
Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu Ser Ala Ala
165 170 175

GAA ATC GCC GGT CTC ACA CTA CGC AGC ACC GCC ATG ATT ACC AAT GGC 576
Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile Thr Asn Gly
180 185 190

GTT GCA ACG GGA TTG CTG ATT GCC GGC GGA ATC GCC AAC CCG GTA CCT 624
Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro
195 200 205

AAC CTC TTC GGG CTG GCT AAC GGT GGA TCG GAA TGG GGA GCG CCA TTA 672
Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu
210 215 220

ATT GGC TCC GGG CAA GCA ACC CAA GTT GGC GCC GGC ATC CAG GAT CAG 720
Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile Gln Asp Gln
225 230 235 240

AGC GCG GGC ATT TCA GAA GTG ACA GCA GGC TAT CAG CGT CGT CAG GAA 768
Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu
245 250 255

GAA TGG GCA TTG CAA CGC GAT ATT GCT GAT AAC GAA ATA ACC CAA CTG 816
Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu
260 265 270

GAT GCC CAG ATA CAA AGC CTG CAA GAG CAA ATC ACG ATG GCA CAA AAA 864
Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys
275 280 285

S tr '^'^■^ ^^"^ --^^ AAT GCC CAA GCG ATT TAT 

5 .in ru Thr L«u ser Glu Thr Glu Gin Ala Asn Ala Gin Ala ill 



OAC CTG CAA ACC ACT CCT TTT ACC GGG CAC GCA CTC T^T AAC TGG -Tr 
ASP Leu Gin Thr Thr Arg Phe Thr Cly Gin Ala lIS JJn Ifp ?Zr 

515 



GO- 3GT CGT CTC TCC GCC CTC TAT TAC CAA ATS TAT CAT TCr- ACT CTG 1003 

Ala Gly Arg Leu 3er Ala Leu Tyr Tyr Gin Met T,r Asp Sr S III 
15 335 

CCA ATC TGT CTC CAG CCA AAA GCC CCA TTA GTA CAG CAA TTA cru- r^n 

Pre n. cys Leu Gin Pro Lys Ala Ala Leu VaJ C^n "J oiu ' 



350 



20 



40 



AAA GAG AGC GAC ACT CTT TTC CAG CTT CCG GTC TCG AAT GAT ^TC TCC M 
Lys clu ser Asp Ser Leu Phe Gin Val Pro 7al Trp Isl S 

360 365 

CXA CGG CTG TTA GCA CCA GAA CGT TTA ACT TCA GAC CTA TAG AAJV /--rv- , i.., 
25 Gin Gly Leu Leu Ala Gly Glu Cly Leu Ser Ser ctS lII oTn JJ^t l2 

375 380 

'If f?*^ '^'^ GOT ATT GGG CTA GAA GCC ATC C-C 1200 

30 385 '"'^ lie A^g ' ° 

?S ?I? f'^ f ^ ^w"" ^ GCC ACA GGG ACC TTA AGT GAA AAT 1243 

Thr val ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn 

35 415 

ATC AAT AAA CTG CTT AAC GGG CAA ACG GTA TCT CCA TCC GGT GGC GTr i->oa 
He Asn Lys Val Leu Asn Gly Glu Thr Val s.r Pro S Sy §?y S 

425 

S S S I?? ti; Z l-S, S ?S 2S Z 

440 445 

45 IT. qj S'.Xi^.^.z^.iz fT. t^i jj= ^ - 

455 460 

CGT ATT AAA COT ATC CCC GTC ACC CTG CCA ACA CTT CTG GGG CCA TIT Id -in 

50 ^65 """^ ^'t ^""^ '-^^ ^" S ^-/r "° 

475 430 

G?J A=I nt* f?^ ^^"^ """^ ^■^'^ GCG GAA ATC GCC GCC TTA 1483 

Gin ASP Leu Glu Ala Thr Leu Val Mec Gly Ala Glu He Ala Ala Leu 

55 <90 495 

TCA CAC GGT CTC AAT CAC GCA GGC CGG TTT GTT ACC GAC TTT air r^r .c-,- 
ser His Gly Val Asn Asp cly Gly Arg Ph. vH Sr Sp Asp '"^ 

505 

AGC CGT TTT CTG CCT TTT GAA GGT CGA GAT GCA ACA ACC GGC ACA CTC 15fld 
ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Sy i^r lIu 

520 525 

GAG CTC AAT ATT TTC CAT GCC GGT AAA GAG GGA ACG CAA CAC CAG fra isi, 
Giu Leu Asn lie Phe His Ala Gly Lys Glu Cly Thr cJJ 5ts gJS JIS 
' 535 540 






-TC GCC .^JkT CTO ACT GAC AT? ATT GT3 CAT CTG A.^T TAC ATC ATT :GA lii^'- 

Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg

545 550 555 560

GAC GCG TAA 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 563 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10 15

Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr
20 25 30

Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40 45

Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala
50 55 60

Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val
65 70 75 80

Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val
85 90 95

Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His
100 105 110

Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln Gln Glu Ala
115 120 125

Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu Lys Gly Leu
130 135 140

Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly Asp Thr Leu
145 150 155 160

Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu Ser Ala Ala
165 170 175

Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile Thr Asn Gly
180 185 190

Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro
195 200 205

Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu
210 215 220

Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile Gln Asp Gln
225 230 235 240

Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu
245 250 255

Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu
260 265 270

Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys
275 280 285

Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln Ala Ile Tyr
290 295 300

Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr Asn Trp Met
305 310 315 320

Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp Ser Thr Leu
325 330 335

Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu Leu Gly Glu
340 345 350

Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn Asp Leu Trp
355 360 365

Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gln Lys Leu
370 375 380

Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu Ala Ile Arg
385 390 395 400

Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn
405 410 415

Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val
420 425 430

Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu Asp Leu Ser
435 440 445

Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg
450 455 460

Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr
465 470 475 480

Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile Ala Ala Leu
485 490 495

Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp
500 505 510

Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu
515 520 525

Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln His Glu Leu
530 535 540

Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg
545 550 555 560

Asp Ala
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4458 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..4458

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:



2v ^ ir. in ^ 5it s - - - ™ - - ccc 



43 



5 10 15 



I" If. m 51? s s? - s s - s - - 

30 " 45 

q ^s? s s s s - - - s - - - 

55 60 



^ S5 S SJi S^I S5 - - - ^ - - - - 

85 90 95 



45 



lIu' H; lit ;ij ji? ?r ?f ^^^^^ 

105 

IV. Ji: - SI JJ? ?s ir. is - 

50 "° 125 

Pro lH CAA ccc ccc CAG ATC CTG GAT 432 

''^^ ^''^ Arg TVr Gin AU Arg cin He Leu Asp 

^ r° "^^^ ^''^ '^'^ 430 

145 ]Vr Trp Gin Pro Ala Ser Gly cin Clu Cly Arg 

155 

fiO S ?rp ^IS J" f ^ ^-^^ ATC TTA GGG S23 

»rp uau lie Ser Thr Pro Asp Gly His Leu His lie Leu Gly 
165 170 

ill f:° '^'^ AAT GAC CAA CAA ATC 5-5 

L,s Thr Ala Gin Ala Cys Leu Ala Asn Pro Gin Asn Asp cin nl 



55 
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ISv 



130 



10 



15 



id 



20 



sec CAG T3G TTG CTC CAA GAJ\ ACT GTG .kCO CCA GCC 3GT GAA '"nT GT-- o'J 

Ala Gin Trp Leu Leu Olu Giu Thr Val Thr Pro aIa g1/ gIu His Val * 
135 200 205 

AOC TAT CA.A TAT CGA GCC GAA GAT CAA GCC CAT TGT GAC GAC AAT GAA 
.?«r T/r Gin T/t Arg Ala Glu Asp Glu Ala His C/s Asp Asp Asn Clu 

215 220 

-AAA ACC GCT CAT CCC .\AT GTT ACC CCA CAG CGC TAT CTG GTA CAG GTG "Vt 
Lys Thr Ala His Pro Asn Val Thr Ala Gin Arg Tyr Leu Val Sn vl? ' 

230 235 

.'AC TAC GGC AAC ATC AAA CCA CAA GCC AGC CTG TTC GTA CTG GAT i-'- 
Asn Tyr Gly Asn He Lys Pro Gin Ala Ser Leu Phe Val Leu Asp Asn 

245 250 255 

GCA CCT CCC GCA CCG GAA GAG TCG CTG TTT CAT CTC GTC TIT G\C 316 
Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His 

2fiO 265 270 

Clu Ar^ f«o tI" I" f"^ ^^"^ """^ '^^^ GAT OCA GOT 864 

CI/ Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly 
-• ' ''5 280 285 

ACA GCC CAA TGG TCT GTA CGC CCG GAT ATC TTC TCT CGC TAT GAA TAT 912 
Thr Ala Gin Trp Ser Val Arg Pro Asp lie Phe Ser Arg Sr cJI 

GGT TTT CAA GTG CGT ACT CGC CGC TTA TCT CAA CAA GTC CTC ATC riT 960 
Gly Phe Glu val Arg Thr Arg Arg Leu Cys Gin Gin Val Le5 SI? 5^ 

310 

iu^ GAA GCC AGT ACC AAT GAC GCC CCG 1003 

HIS Arg Thr Ala Leu Mec Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro 

325 335 

^i^ ?ir s?; ri" ^ gtc lose 

Giu Leu val Gly Arg Leu He Leu Clu Tyr Asp Lys Asn Ala Ser Val 

345 350 

ii'r Thr ^T*" GAA TCG GAC GGG AGG 1104 

Thr Thr Leu He Thr He Arg Gin Leu Ser Hi* Glu Ser Asp Gly Arg 

360 365 
Pro vl? ^11 lit F^'^ 5^ P'^ TGG CAA CCG rrr GAT CTC U 52 

Gin 

J30 



30 



35 



40 



50 



55 



Pro vkl Thr r^^Z » ^ . ^ , TOG CAA CCG TTT GAT CTC 

Pro val Thr Gin Pro Pro Leu Glu Leu Ala Trp Gin Arg Phe Asp Leu 

375 

CAC AAA ATC CCC ACA TGG CAA CGC TTT GAC GCA CTA GAT AAT TTT AAC 1200 
Clu Lys He Pro Thr Trp Gin Arg Phe Asp Ala Leu Asp Asn Phe 

390 395 400 

llr rtn .^"^ "^^"^ ^"^ G*"^ GAA GGG rro CCA 1248 

ser Gin Gin Arg Tyr Gin Leu val Asp Leu Arg Gly Glu Gly Leu Pro 

405 410 
GGT ATG CTC TAT CAA GAT CGA GGC OCT TGG TCG TAT AAA GOT CCC CAA 129 6 

Gly Met Leu Tyr Gin Asp Arg Gly Ala Trp Trp Tyr Lv^ pro 

420 425 430 

Arl rtn n'^ t'"^ A^"^ TAC GAC AAA ATC GCC 134 4 

Arg Gin G u Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys He Ala 
4->5 440 445 



60 



- -.G rCT Art CTA CCC PLnT TTG CAG GAT AAT GCC TCA TTG ATG GAT ' - 
Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460 

ATC AAC GCA CAC GGC CAA CTG GAT TGG GTT GTT ACC GCC TCC GGT ATT 1440
Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480

CGC GCA TAC CAT AGT CAG CAA CCC GAT CGA AAG TGG ACG CAC TTT \ro 1488
Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495

CCA ATC AAT GCC TTG CCC CTG GAA TAT TTT CAT CCA AGC ATC CAG TT- 1536
Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510

GCT GAC CTT ACC GGG GCA GGC TTA TCT GAT TTA GTC TTC ATC GGG CCG 1584
Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525

AAA AGC GTG CGT CTA TAT GCC AAC CAG CGA AAC GGC TGG CGT AAA GGA 1632
Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540

GAA GAT GTC CCC CAA TCC ACA GGT ATC ACC CTG CCT GTC ACA GGG ACC 1680
Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560

GAT GCC CGC AAA CTG GTG GCT TTC AGT GAT ATG CTC GGT TCC GGT CAA 1728
Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575

CAA CAT CTG GTG GAA ATC AAG GGT AAT CGC GTC ACC TCT TGG CCG AAT 1776
Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn
580 585 590

CTA GGG CAT GCC CGT TTC CGT CAA CCA CTA ACT CTC TCA GGA TTT AGC 1824
Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605

CAG CCC GAA AAT AGC TTC AAT CCC GAA CGG CTC TTT CTC GCG GAT ATC 1872
Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620

CAC GGC TCC GGC ACC ACC GAC CTT ATC TAT GCG CAA TCC GGC TCT TTC 1920
Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640

CTC ATT TAT CTC AAC CAA AGT GGT AAT CAG TTT CAT CCC CCG TTC ACA 1968
Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655

TTA GCG TTC CCA GAA GGC CTA CAA TTT GAC AAC ACT TCC CAA CTT CAA 2016
Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670

GTC GCC GAT ATT CAG GGA TTA GGG ATA GCC AGC TTC ATT CTC ACT GTG 2064
Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685

CCA CAT ATC GCG CCA CAT CAC TGG CGT TCT GAC CTG TCA CTG ACC AAA 2112
Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700

CCC TGG TTC TTC AAT CTA ATC AAC AAT AAC CGG GCC CCA CAT CAC ACG 2160
Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720

CTA CAT TAT CGT ACT TCC GCG CAA TTC TGG TTG GAT GAA AAA TTA CAG 2208
Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735

CTC ACC AAA GCA GGC TCT CCG GCT TCT TAT CTG CCG TTT CCA ATG 2256
Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750

CAT TTG CTA TGG TAT ACC GAA ATT CAG GAT GAA ATC AGC GGC AAC CGG 2304
His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765

CTC ACC AGT GAA GTC AAC TAC AGC CAC GGC GTC TGG GAT GGT AAA GAG 2352
Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780

CGG GAA TTC AGA GGA TTT GGC TGC ATC AAA CAG ACA GAT ACC ACA ACG 2400
Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800

TTT TCT CAC GGC ACC GCC CCC GAA CAG GCG GCA CCG TCG CTG AGT ATT 2448
Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815

AGC TGG TTT CCC ACC GGC ATG GAT GAA GTA GAC AGC CAA TTA GCT ACG 2496
Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830

GAA TAT TGG CAG GCA GAC ACG CAA GCT TAT AGC GGA TTT GAA ACC CGT 2544
Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845

TAT ACC GTC TGG GAT CAC ACC AAC CAG ACA GAC CAA GCA TTT ACC CCC 2592
Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860

AAT GAG ACA CAA CGT AAC TGG CTG ACG CGA GCG CTT AAA GGC CAA CTG 2640
Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880

CTA CGC ACT GAG CTC TAC GGT CTG GAC GGA ACA GAT AAG CAA ACA GTG 2688
Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895

CCT TAT ACC GTC AGT GAA TCG CGC TAT CAG GTA CGC TCT ATT CCC GTA 2736
Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910

AAT AAA GAA ACT GAA TTA TCT GCC TCG GTG ACT GCT ATT GAA AAT CGC 2784
Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925

ACC TAC CAC TAT GAA CGT ATC ATC ACT GAC CCA CAG TTC AGC CAC AGT 2832
Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940

ATC AAG TTG CAA CAC GAT ATC TTT GGT CAA TCA CTG CAA AGT GTC GAT 2880
Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960

ATT CCC TGG CCG CGC CGC GAA AAA CCA GCA GTG AAT CCC TAC CCG CCT 2928
Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975
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ATT :TG ZZZ Z.kA ACO CTA TTT 'Zk-Z XZC ACT TAT 3AT OAT CA.A ZK\ ZK-. -/'■S 
Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln

980 985 990

CTA TTA CCT CTC GTG ACA CAA AAA AAT AGC TGG CAT CAC CTG ACT GAT 3024
Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005

GGG GAA AAC TGG CGA TTA GGT TTA CCG AAT GCA CAA CGC CCT GAT GTT 3072
Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp Val
1010 1015 1020

TAT ACT TAT GAC CGG AGC AAA ATT CCA ACC GAA CGC ATT TCC CTT GAA 3120
Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu
1025 1030 1035 1040

ATC TTG CTG AAA GAT GAT GGC CTG CTA GCA GAT GAA AAA CCG GCC GTT 3168
Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055

TAT CTG GGA CAA CAA CAC ACC TTT TAC ACC GCC CCT CAA CCG GAA CTC 3216
Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly Gln Ala Glu Val
1060 1065 1070

ACT CTA GAA AAA CCC ACG TTA CAA GCA CTG GTC GCG TTC CAA GAA ACC 3264
Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val Ala Phe Gln Glu Thr
1075 1080 1085

GCC ATG ATG GAC GAT ACC TCA TTA CAC GCC TAT CAA GGC CTC ATT GAA 3312
Ala Met Met Asp Asp Thr Ser Leu Gln Ala Tyr Glu Gly Val Ile Glu
1090 1095 1100

GAC CAA GAG TTG AAT ACC CCC CTC ACA CAC CCC CCT TAT CAC CAA CTC 3360
Glu Gln Glu Leu Asn Thr Ala Leu Thr Gln Ala Gly Tyr Gln Gln Val
1105 1110 1115 1120

GCC CCG TTC TTT AAT ACC ACA TCA CAA ACC CCG GTA TGG GCG CCA CGG 3408
Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg
1125 1130 1135

CAA GGT TAT ACC GAT TAC GGT GAC GCC GCA CAC TTC TGG CGG CCT CAG 3456
Gln Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gln Phe Trp Arg Pro Gln
1140 1145 1150

GCT CAG CGT AAC TCG TTG CTG ACA GGG AAA ACC ACA CTG ACC TGG GAT 3504
Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp
1155 1160 1165

ACC CAT CAT TGT GTA ATA ATA CAG ACT CAA GAT GCC GCT GGA TTA ACG 3552
Thr His His Cys Val Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr
1170 1175 1180

ACG CAA GCC CAT TAC CAT TAT CCT TTC CTT ACA CCG GTA CAA CTG ACA 3600
Thr Gln Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr
1185 1190 1195 1200

GAT ATT AAT CAT AAT CAA CAT ATT CTG ACT CTG CAC CCC CTA CGT CGC 3648
Asp Ile Asn Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg
1205 1210 1215

GTA ACC ACC AGC CGG TTC TGG GGC ACA GAC GCA GCA CAA CCC GCA GGC 3696
Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly
1220 1225 1230

TAT TCC AAC CAG CCC TTC ACA CCA CCG GAC TCC GTA GAT AAA GCG CTG 3744
Tyr Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu
1235 1240 1245

WO 97/17432 PCT/US96/18003

GCA TTA ACC CCC GCA CTC CCT GTT CCC C.^U TCT TTA 3TC TAT GCr CTT 
Ala Leu Thr Gly Ala Leu Pro Val Ala Gln Cys Leu Val Tyr Ala Val
1250 1255 1260

GAT AGC TCC ATC CCC TCG TTA TCT TTG TCT CAG CTT TCT CAG TCA CAA 3840
Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gln Leu Ser Gln Ser Gln
1265 1270 1275 1280

G.i.:^ ZkZ GCA GAA GCG CTA TGG GCG CAA CTG CGT GCC GCT CAT ATC ATT 3888
Glu Glu Ala Glu Ala Leu Trp Ala Gln Leu Arg Ala Ala His Met Ile
1285 1290 1295

ACC GAA GAT GGG AAA GTG TGT GCG TTA AGC GGG AAA CGA GGA ACA AGC 3936
Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser
1300 1305 1310

CAT CAG AAC CTG ACG ATT CAA CTT ATT TCG CTA TTG GCA ACT ATT CCC 3984
His Gln Asn Leu Thr Ile Gln Leu Ile Ser Leu Leu Ala Ser Ile Pro
1315 1320 1325

CGT TTA CCC CCA CAT GTA CTG GGG ATC ACC ACT GAT CGC TAT GAT AGC 4032
Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser
1330 1335 1340

GAT CCC CAA CAG CAG CAC CAA CAG ACG GTG AGC TTT ACT GAC GGT TTT 4080
Asp Pro Gln Gln Gln His Gln Gln Thr Val Ser Phe Ser Asp Gly Phe
1345 1350 1355 1360

GCC CGC TTA CTC CAG ACT TCA GCT CGT CAT GAG TCA GGT GAT GCC TGG 4128
Gly Arg Leu Leu Gln Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp
1365 1370 1375

CAA CGT AAA GAG GAT CGC GGG CTC CTC GTG GAT GCA AAT GGC GTT CTG 4176
Gln Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu
1380 1385 1390

CTC ACT GCC CCT ACA GAC ACC CGA TGG GCC CTT TCC CGT CCC ACA GAA 4224
Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu
1395 1400 1405

TAT GAC GAC AAA GGC CAA CCT GTG CGT ACT TAT CAA CCC TAT TTT CTA 4272
Tyr Asp Asp Lys Gly Gln Pro Val Arg Thr Tyr Gln Pro Tyr Phe Leu
1410 1415 1420

AAT CAC TCG CGT TAG GTT ACT GAT GAC AGC GCA CGA GAT GAC CTG TTT 4320
Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe
1425 1430 1435 1440

GCC GAT ACC CAC CTT TAT GAT CCA TTG GGA CCC GAA TAC AAA GTC ATC 4368
Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile
1445 1450 1455

ACT GCT AAC AAA TAT TTG CGA GAA AAC CTG TAC ACC CCG TGG TTT ATT 4416
Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile
1460 1465 1470

GTC ACT GAG GAT GAA AAC GAT ACA GCA TCA AGA ACC CCA TAG 4458
Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro *
1475 1480 1485

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1486 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10 15

Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
20 25 30

Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
35 40 45

Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
50 55 60

Asn Gly Pro Phe Gly Ile Gly Trp Gln Cys Gly Val Met Ser Ile Ser
65 70 75 80

Arg Arg Thr Gln His Gly Ile Pro Gln Tyr Gly Asn Asp Asp Thr Phe
85 90 95

Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly
100 105 110

Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu
115 120 125

Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp
130 135 140

Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg
145 150 155 160

Ala Phe Trp Leu Ile Ser Thr Pro Asp Gly His Leu His Ile Leu Gly
165 170 175

Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile
180 185 190

Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205

Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu
210 215 220

Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val
225 230 235 240

Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn
245 250 255

Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His
260 265 270

Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly
275 280 285

Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr
290 295 300

Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe
305 310 315 320

His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro
325 330 335

Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val
340 345 350

Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg
355 360 365

Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu
370 375 380

Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn
385 390 395 400

Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro
405 410 415

Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln
420 425 430

Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala
435 440 445

Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460

Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480

Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495

Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510

Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525

Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540

Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560

Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575

Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn
580 585 590

Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605

Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620

Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640

Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655



Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670

Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685

Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700

Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720

Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735

Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750

His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765

Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780

Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800

Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815

Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830

Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845

Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860

Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880

Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895

Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910

Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925

Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940

Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960

Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975

Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln
980 985 990

Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005



Cly Glu^A.n Trp Arg Leu gi-.^l^u Pro Asn Ala Cln Arg Ar, Asp v.i 
Tyr Thr Tyr Asp Arg Ser Lys lU Pro Thr Glu Gly lie Ser Leu clu 

10 '-■'^^ tSs''''' tJso-'*^ rj^5^^^ 

Tyr Leu cly cin^cin Gin Thr Phe Tyr Jhr Ala Gly oin AU Glu Val 
15 Thr Leu Glu^Lys Pro Thr Leu Gln^Ala Leu Val Ala Phe Cln Glu 



Thr 

1085 



Ala Met Mec Asp Asp Thr Ser Leu cln Ala Tyr Clu Gly Val lie Glu 
20 1100 

Giu^Gln Clu Leu Asn Thr Ala Leu Thr cln Ala cly Tyr cln Gin v.l 

25 ^"^^ ^""^ tf^J*"' ''''' ^^"^ ^'^^ Trp Ala Ala Arg 

Cln Cly TVr Thr Asp Tyr cly Asp Ala Ala Cln Phe Trp Arg Pro Cln 

1145 

30 Ala cm Arg^Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 

1165 

Thr His His Cys val He lie Cln Thr Cln Asp Ala Ala Gly Leu Thr 
35 11^5 

Thr^Gln Ala His lyr Ajp^iyr Arg Phe Leu Thr^Pro Val cln Leu Thr 

40 "^"^ Thr Leu ASP Ala Leu Gly Arg 

1205 1210 i215 

Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gin AU Ala Gly 

1220 1225 ' 



1230 



45 Tvr ser Asn cln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu 

1240 X245 

Ala Leu Thr Gly Ala Leu Pro Val Ala Cln Cys Leu Val Tyr Ala Val 
50 ^255 1260 

Asp ser Trp Mec Pro Ser Leu Ser Leu Ser Gin Leu Ser cln Ser Gin 

1275 1230 

55 ''^^ T'^P Ala Gin Leu Arg Ala Ala His Mec He 

1285 1290 1295 

Thr Glu ASP Gly Lys Val Cys Ala Leu Ser Cly Lys Arg Gly Thr Ser 

1305 131Q 

«) His Gin Asn^Leu Thr He Cln Leu He Ser Leu Leu Ala Ser He Pro 



1320 1325 

Ser 



Arg Leu Pro Pro His Val Leu cly He Thr Thr Asp Arg Tyr Asp 
65 l-*^' 1340 

Asp Pro Cln Cln cln His Cln Cln Thr Val Ser Phe Ser Asp Gly Phe 
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5 



1350 ^355 
-ly Ar, Leu Leu Cin^5« 3« Ala Arg His^ciu Ser Gly Asp Ai. Ttp 



1j"5 



20 



cm Ara Lys Glu Asp cly Gly Leu Val Vai Asp Ala Asn Gly Val L*u 

1385 139Q 

10 \\U^'° tlL'"'^'^ ^'^^ ^« •=^'^9 Thr 3iu 

^'''^ 1400 1405 

Tyr Asp Asp Lys Gly Gin Pro Val Ar, Thr Tyr Gin Pro Tyv Phe l.u 

1415 1420 

15 Asn ASP Trp Arg TVr Val Ser Asp Asp Ser Ala Arg Asp Asp L*u Ph. 

i435 ^^^^ 

Ala ASP Thr His Leu Tyr Asp Pro Leu Gly Ar, Glu Tyr Lys Val He 

1**5 1450 

Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile 

1465 1470 

Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro • 
"75 1480 1435 

(2) INFORMATION FOR SEQ ID NO: 33: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

^ M« vl? ?hr 511 2« «^ t:c .3 

Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser 

5 10 15 

as ^ ^'^^ TAT C.\A AAC OTA TTT GAT ATC G-:v 9,; 

45 Glu Gin Pro Leu Leu Asp Ala Gly Tyr Gin Asn Val AsJ lu 

20 25 30 

Sr xTf ^ ^^"^ '^'^ '^<^ GTT CCC ACC CTC CCC G-T 144 

Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val 
35 40 45 

^ St^ "^^"^ <^AA CGT GCG GAA .AAT IS^ 

Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn 

55 " 60 



CTG AAA TCC CTC TAC CGA GCC TGC CAA TTG CGT GAG GAG CCG CTT -'T 5jn 
Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile 

^ 30 

fiO AAA GCC CTC GCt AAA CTT AAC CTA CAA TCC AAC GTT TCT CTC CTT - • ^« 
Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln 

35 90 35 

GAT GCT TTG GTA GAG AAT ATT GGC GGT GAT GGG GAT TTC ACC CAT 5 - • 

W ASP Ala Leu Val Glu Asn lie Gly Gly Asp Gly aJ JJ^ ser 2J 
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100 



105 



lie 



ATG AAC CGT GCC ACT CAA TAT GCT GAC OCT GCC TCT ATT CAA T-'C C-1 
Met Asn Arg Ala Ser Gln Tyr Ala Asp Ala Ala Ser Ile Gln Ser Leu 
115 120 125 



334 



10 



TTT TCA CCG CGC CGT TAT GCT TCC CCA CTC TAC AGA CTT GCT AAA G^T 432 
Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp 



15 



l'^ wtl l"""^ f*"^ ^""^ °AT AAT CGC CGC GCT G^^T 

Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp 

155 

CTC AAG GAT CTG ATA TTA AGC CAA ACG ACQ ATC AAT AAA GAG GTC ACT 
Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr 

l"' 170 



•130 



523 



TCC CTT GAT ATC TTC TTG GAT GTC CTA CAA AAA GGC GGT AAA GAT ATT 57 ^ 
Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile 

185 190 

ACT GAG CTC TCC GGC GCA TTC TTC CCA ATC ACG TTA CCT TAT GAC GAT 

25 ^""^ f "« Thr U: Pro Asp 

l'^ If? °r ^^^A ACG CTC 672 

His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu 

30 220 

J^n S SIJ ?S HI th"" r° iS* ^'^^ 720 
Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser 

230 235 240 

gJJ cti sSr SI TH° f ^A.:. CAT 768 

Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp 

245 250 255 

40 Jin G?; lil Jh* IT IT ^^"^ TTT TAT TTC AAA CCG 816 

Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala 

260 265 270 

5Il ?iv 21 cf^ ■"'^ °" 'fA^ <^TG TCA CAC TAC ACC AGC 364 

Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser 

275 280 285 

Sn S in vl? ^ f?* ™ "^'^A CCA GAC CAA 912 

Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln 

295 300 

aS Til ?f * rlt ^ '^'^ ^ AAA CTC ACT TCG TCA ATC GCA 
3 of 1 Ala Pro Leu Lys Leu Thr Trp Ser Mec AU 

315 320 

tv^ G?n J^I ^r"^ I''-^ ^""^ ACA ACG ATC GGA GAC 

Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp 

325 330 335 

60 ^y Jin 511 [IS inr ^r*" ^^'^ *<=^ ^« ACT .^C 1056 

Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn 

345 350 

Pro A« 't2l r7 IT ^'-'^ ^^"^ AAA TCA CGC ACT 1104 

Pro Asp Lys Asp Gly Ile Phe Ala Gln Val Ala Asn Lys Ser Gly Ser 

360 365 



50 



960 



55 



lOOd 
ACT CAG CCT TTG CCA AGC TTC CAT CTG CCG GTC ACA CTC GAA CAC AGC 1152 
Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser 
370 375 

GAG AAT AAA GAT CAG TAC TAT CTG AAA ACA GAG CAG GGT TAT ATC ACG 1200 

Glu Asn Lys Asp Gln Tyr Tyr Leu Lys Thr Glu Gln Gly Tyr Ile Thr 

390 395 400 

GTA GAT ACT TCC GGA CAG TCA AAT TGG AAA AAC GCG CTG GTT ATC AAT 1248 
Val Asp Ser Ser Gly Gln Ser Asn Trp Lys Asn Ala Leu Val Ile Asn 

405 410 415 

GGG ACA AAA GAC AAG GGG CTG TTA TTA ACC TTT TGC AGC GAT AGC TCA 1296 

Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser 
420 425 430 

GGC ACT CCG ACA AAC CCT GAT GAT GTG ATT CCT CCC GCT ATC AAT GAT 1344 
Gly Thr Pro Thr Asn Pro Asp Asp Val Ile Pro Pro Ala Ile Asn Asp 
435 440 445 

ATT CCA TCG CCG CCA GCC CGC GAA ACA CTG TCA CTC ACG CCG GTC AGT 1392 
Ile Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser 
450 455 

TAT CAA TTC ATC ACC AAT CCG GCA CCG ACA GAA GAT GAT ATT ACC AAC 1440 
Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn 

470 475 480 

CAT TAT GGT TTT AAC GGC GCT AGC TTA CGG GCT TCT CCA TTG TCA ACC 1488 

His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr 

485 490 495 

AGC GAG TTC ACC AGC AAA CTC AAT TCT ATC GAT ACT TTC TCT GAG AAG 1536 
Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr Phe Cys Glu Lys 

500 505 510 

ACC CGG TTA AGC TTC AAT CAG TTA ATC GAT TTC ACC GCT CAG CAA TCT 1584 
Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr Ala Gln Gln Ser 
515 520 525 

TAC AGT CAA AGC AGC ATT GAT GCG AAA GCA GCC AGC CGC TAT GTT CGT 1632 
Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg 
530 535 540 

TTT GGG GAA ACC ACC CCA ACC CGC GTC AAT GTC TAC GGT GCC GCT TAT 1680 

Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr 
545 550 555 560 

CTC AAC AGC ACA CTC GCA GAC GCG GCT GAT GGT CAA TAT CTC TGG ATT 1728 
Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln Tyr Leu Trp Ile 

565 570 575 

CAG ACT GAT GGC AAG AGC CTA AAT TTC ACT GAC GAT ACG GTA GTC GCC 1776 
Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala 

580 585 590 

TTA GCC GGT CCC GCT GAA AAG CTC GTA CGT TTA TCA TCC CAG ACC GGG 1824 
Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gln Thr Gly 
595 600 605 

CTA TCA TTT GAA GAA TTC GAC TGG CTC ATT GCC AAT GCC AGT CGT AGT 1872 
Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn Ala Ser Arg Ser 
610 615 620 

GTG CCG GAC CAC CAC GAC AAA ATT GTC CTC GAT AAG CCG GTC CTT GAA 1920 
Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu 

630 635 640 

GCA CTC GCA GAG TAT GTC AGC TTA .-.i-A CAC CGC TAT GGG CTT --^A*^ GC" i --3 
Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr Gly Leu Asp Ala 
645 650 655 

AAT ACC TTT GCG ACC TTC ATT AGT GCA GTA AAT CCT TAT ACG CCA GAT 2016 
Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro Tyr Thr Pro Asp 

660 665 670 






CAG ACA CCC AGT TTC TAT GAA ACC CCT TTC CGC TCT CCC GAC GCT AAT 2064 

Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn 
675 680 685 

CAT GTC ATT GCG CTA GGT ACA GAG GTG AAA TAT GCA GAA AAT GAG CAG 2112 

His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gln 

690 695 700 

GAT GAG TTA GCC GCC ATA TGC TGC AAA GCA TTG GGT GTC ACC AGT GAT 2160 
Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly Val Thr Ser Asp 

710 715 720 

GAA CTG CTC CGT ATT GGT CGC TAT TGC TTC GGT AAT GCA GCC AGT TTT 2208 

Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe 
725 730 735 

ACC TTG GAT GAA TAT ACC GCC AGT CAG TTG TAT CGC TTC GGC GCC ATT 2256 
Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg Phe Gly Ala Ile 

740 745 750 

CCC CGT TTC TTT GGG CTC ACA TTT CCC CAA GCC GAA ATT TTA TGG CGT 2304 
Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg 
755 760 765 

m!? ^ ^ ^^"^ ^'^^ "A CAG TTA GGT CAG GCA 2352 

Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala 

775 780 

AAA TCC CTG CAA CCA CTG GCT ATT TTA CGC CGT ACC GAG CAG GTC CTG 2400 

Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr Glu Gln Val Leu 

790 795 800 

GAT TGG ATG TCC TCC GTA AAT CTA AGT CTC ACT TAT CTC CAA GGG ATC 2448 

Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gln Gly Met 

805 810 815 

GTA AGT ACG CAA TGG AGC GGT ACC GCC ACC GCT GAG ATC TTC AAT TTC 2496 
Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe 

820 825 830 

TTC GAA AAC CTT TCT GAC AGC GTC AAT AGT CAA GCT GCC ACT AAA GAA 2544 
Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu 

840 845 

ACA ATC GAT TCG GCG TTA CAG CAG AAA GTC CTC CGG GCG CTA AGC GCC 2592 
Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala 
850 855 860 

GGT TTC GGC ATT AAG AGC AAT GTC ATC GGT ATC GTC ACC TTC TGG CTC 2640 

Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val Thr Phe Trp Leu 

870 875 880 

GAG AAA ATC ACA ATC GGT AGT GAT AAT CCT TTT ACA TTC GCA AAC TAC 2688 
Glu Lys Ile Thr Ile Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr 

885 890 895 



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-rn hII '^■^ TTT AGC CAT GAC X^T G^C A-G TT^ --C -• • 

-rp HIS „sp cm Thr L«u Phe ser His Asp Asn Aia ^^ir cT^ " " 



TCC TTA CAA ACC GAC ACT TCT CTC GTA kTT err ■^<~-r> 

3er Leu Gin Thr Asp Thr Ser Leu vll i7e Til rll ctn nt^ f" ""^^ ''"^ 
9i5 Q,„ Tnr Gin Gin Leu Ser 



920 



.0 5S S S JJJ - - jcc .cc o« e„ ... 



15 



c" ™ 2s ?s s 21 tr 

345 -^^^ Asn Gly He xhr Asn 



950 Qcc 



20 "0 

CAG TGG CAA ACT CAA GTC ACC CTT TCC CC.T riT 

=1« Trp o:u TJj Cl„ v.. Thr vH SI Z S SI? S 

985 

TTC CAT CAA TTA AAT GCC .VAT GAT ATC Arr Knm .... 

.H. o>„ ^ ».„ ™ ?s ?s §K s,^ 
30 ?s - - - ^ ?s s 






1020 



'SI JSr zis s gJ; jji r ^ ^ '^^c ^1=0 

35 1025 ?030 ^'"^ Thr ser 

1035 1040 

40 "50 1055 

s,^ s £s s s s.- - - s 

X 0 6 5 



m: - s - - ™ - 



1080 1085 



GCC GCA ATC AGC AAT CGT CAG TAA 

Ala Ala Ile Ser Asn Arg Gln ... 
1090 1095 



3285 
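
As a reading aid for the codon-over-residue layout of SEQ ID NO: 33, the following minimal Python sketch translates one legible codon row (the row ending at nucleotide 1152) and checks that the stated length of 3288 base pairs corresponds to 1095 residues plus one stop codon. The names `CODON_TABLE`, `ONE_TO_THREE` and `translate` are illustrative and not part of the listing, and the use of the standard genetic code is an assumption, since the listing does not name a translation table.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G at each codon position.
# Assumption: the listing does not state which translation table was used.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes; "***" marks a stop codon.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residue codes."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [ONE_TO_THREE[CODON_TABLE[c]] for c in codons]


# Codon row of SEQ ID NO: 33 ending at nucleotide 1152, as printed above.
row = "ACT CAG CCT TTG CCA AGC TTC CAT CTG CCG GTC ACA CTC GAA CAC AGC"
residues = translate(row)
print(" ".join(residues))
# Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser

# Length check: 3288 bp = 1096 codons = 1095 residues + 1 stop codon.
total_bp = 3288
protein_len = total_bp // 3 - 1
print(protein_len)  # 1095
```

The same helper can be applied to any other legible codon row to see where a printed residue and its codon disagree, which in this text usually points to a character-recognition error in one of the two rows.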



(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1095 amino acids 

(B) TYPE: amino acids 

(C) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Features From To Description ' 

65 it* 267 SEQ ID NO: 15 

254 492 TcaAii peptide 
Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser 
5 10 15 

Glu Gln Pro Leu Leu Asp Ala Gly Tyr Gln Asn Val Phe Asp Ile 

Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val 

40 45 

Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn 

55 60 

Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile 

' ^ 80 

Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln 
85 90 95 

Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu 



110 



25 u? ^-^^ "^"^ Ala Ala Ser He Gin ser Leu 

120 

Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp 

135 

Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp 

Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr 
165 170 175 

Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile 

185 

Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp 



205 



His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu 

215 220 

Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser 

230 235 240 

Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp 
250 255 

Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala 

265 270 

Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser 

280 285 



Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln 

300 

Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala 

310 

Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp 
330 335 

Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn 

340 345 350 



Pro Asp Lys Asp Gly Ile Phe Ala Gln Val Ala Asn Lys Ser Gly Ser 
355 360 365 

Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser 
370 375 380 






Glu Asn Lys Asp Gln Tyr Tyr Leu Lys Thr Glu Gln Gly Tyr Ile Thr 

385 390 395 400 

Val Asp Ser Ser Gly Gln Ser Asn Trp Lys Asn Ala Leu Val Ile Asn 

405 410 415 

Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser 

420 425 430 

Gly Thr Pro Thr Asn Pro Asp Asp Val Ile Pro Pro Ala Ile Asn Asp 

435 440 445 



Ile Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser 
450 455 460 



Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn 
465 470 475 480 

His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr 

485 490 495 

Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr Phe Cys Glu Lys 

500 505 510 

Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr Ala Gln Gln Ser 
515 520 525 

Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg 
530 535 540 



Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr 
545 550 555 560 

Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln Tyr Leu Trp Ile 

565 570 575 

Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala 

580 585 590 






Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gln Thr Gly 
595 600 605 

Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn Ala Ser Arg Ser 
610 615 620 

Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu 
625 630 635 640 

Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr Gly Leu Asp Ala 

645 650 655 

Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro Tyr Thr Pro Asp 

660 665 670 






Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn 
675 680 685 

His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gln 
690 695 700 



Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly Val Thr Ser Asp 

Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe 

725 730 

Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg Phe Gly Ala Ile 

745 

Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg 
755 760 765 



Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala 

775 780 






Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr Glu Gln Val Leu 

795 800 

Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gln Gly Met 

805 810 815 

Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe 

820 825 830 

Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu 

840 845 

Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala 

855 860 

Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val Thr Phe Trp Leu 
875 880 

Glu Lys Ile Thr Ile Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr 

890 895 

Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu 

905 910 

Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr Gln Gln Leu Ser 
915 920 925 

Gln Leu Val Leu Ile Val Lys Trp Leu Ser Leu Thr Glu Gln Asp Leu 

935 940 

Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn Gly Ile Thr Asn 

950 955 960 



Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys 

965 970 975 

Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys 

980 985 990 

Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser 
995 1000 1005 

^ '^'^ '^^P Thr Gly Ala Cln Val 

^"1" 1015 1020 

Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser 
1025 1030 1035 1040 

Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln Arg Leu Asn Val 
1045 1050 1055 



Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gln Ala Asp Pro 
1060 1065 1070 

Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gln Asn Leu Ser 
1075 1080 1085 

Ala Ala Ile Ser Asn Arg Gln ••• 
1090 1095 






(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 603 amino acids 

(B) TYPE: amino acid 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr 
1 5 10 15 

Phe Cys Glu Lys Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr 

20 25 30 

Ala Gln Gln Ser Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser 

35 40 45 

Arg Tyr Val Arg Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr 

Gly Ala Ala Tyr Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln 

65 70 75 80 

Tyr Leu Trp Ile Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp 

85 90 95 

Thr Val Val Ala Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser 

100 105 110 

Ser Gln Thr Gly Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn 

115 120 125 

Ala Ser Arg Ser Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys 
135 140 

Pro Val Leu Glu Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr 

150 155 160 

Gly Leu Asp Ala Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro 

165 170 175 

Tyr Thr Pro Asp Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser 

180 185 190 

Ala Asp Gly Asn His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala 

195 200 205 

Glu Asn Glu Gln Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly 
210 215 220 
Val Thr Ser Asp Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn 

230 240 

Ala Gly Arg Phe Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg 

250 255 

Phe Gly Ala Ile Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu 

260 265 270 

Ile Leu Trp Arg Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln 

275 280 285 

Xaa Gly Gln Ala Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr 
290 295 300 

Glu Gln Val Leu Asp Trp Met Ser Pro Val Asn Leu Ser Leu Thr Tyr 

310 315 320 

Leu Gln Gly Met Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu 

325 330 335 

Met Phe Asn Phe Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala 

340 345 350 

Xaa Thr Lys Glu Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg 
355 360 365 

Ala Leu Ser Ala Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val 
370 375 

Thr Phe Trp Leu Glu Lys Ile Thr Ile Gly Arg Asp Asn Pro Phe Thr 

390 395 400 

Leu Ala Asn Tyr Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn 

405 410 415 

Ala Thr Leu Glu Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr 

420 425 430 

Gln Gln Leu Ser Gln Leu Val Leu Ile Val Lys Trp Val Ser Leu Thr 

435 440 

Glu Gln Asp Leu Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn 
450 455 

Gly Ile Thr Asn Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu 

470 475 480 

Ser Arg Phe Lys Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu 

485 490 495 

Ala Met Arg Cys Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu 

500 505 510 

Asn Ala Gly Ser Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr 
515 520 525 

530 '"^^^ — ^^'^ '^^^ '^^^ 



535 540 



Ser Phe Thr Ser Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln 

550 555 560 

Arg Leu Asn Val Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met 

565 570 575 
Gln Ala Asp Pro Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala 

580 585 

Gln Asn Leu Ser Ala Ala Ile Ser Asn Arg Gln * 

595 600 






(2) INFORMATION FOR SEQ ID NO: 36: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2557 base pairs 
(B) TYPE: nucleic acid 
(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

CAATTCGCCT TCCGTTTAAT ATTGATGATG TCTCCCTCTT CCGCCTCCTT AAAATTACCC 
ACCATGATAA TAAAGATGGA AAAATTAAAA ATAACCTAAA CAATCTTTCC AATTTATATA 
TTOGAAAATT ACTCGCAGAT ATTCATCAAT TAACCATTCA TCAACTCGAT TTATT^CTGA 
TTCCCCTAGC TCAAGGAAAA ACTAATTTAT CCOCTATXTAG TCATAACCAA TrCGCTACCC 
TGATCACAAA ACTCAATACT ATTACCAGCT GGCTACATAC ACACAAOTGC AGTCTATTCC 
AGCTATTTAT CATGACCTCC ACCAGCTATA ACAAAACCCT AACGCCTXyU ATT.^AGAATT 
TGCTCGATAC CGTCTACCAC GCTTTACAAC CTTTTCATAA AGACAAACCA GATTTCCTAC 
ATGTCATGGC GCCCTATATT GCGGCCACCT TGCAATTATC ATCGGAAAAT GTCGCCCACT 
CGCTACTCCT TTGGGCAGAT AAGTTACACC CCGGCCACGC CGCAATGACA GCACAGGGAN 
TCTGCCACTG CTTGAATACT AAGTATACGC CGGGTTCATC GGAAGCCCTA GAAACCCAGG 
AACATATCCT TCACTATTOT CAGGCTCTCG CACAATTOGA AATCGTTTAC CATTCCACCC 
GCATCAACGA AAACGCCTTC CCTCTATTTO TGACAAAACC AOACATCTTT GGCGCTCCAA 
CTGGACCACC GCCCGCGCAT GATOCCCTTT CACTCATTAT GCTCACACGT TTTCCGGATT 
GGGTGAACGC ACTAGGCGAA AAAOCOTCCT CGGTCCTAGC GGCATTTCAA GCTAACTCGT 
TAACGGCAGA ACAACTGGCT GATCCCATGA ATCTTCATCC TAATTPOCTO TTCCAAGCCA 
GTATTCAACC ACAAAATCAT CAACATCTTC CCCCAGTAAC TCCAGAAAAT GCGTTCTCCT 
GTTGGACATC TATCAATACT ATCCTGCAAT GGOTTAATCT CGCACAACAA TTCAAATCTC 
GCCCCACACG GCGTTTCCCC TTTGGTCGGG CTCGATTATA TTCAATCAAT GAAAGAGACA 
CCCACCTATG CCCAGTGGCA AAACGCGGCA GGCGTATTAA CCGCCGGGTT GAATTC^ACA 
ACAGGCTAAT ACATTACAAC GCTTTPCTGC ATGAATCTCG CAGTCCCGCA TTAAGCACCT 
ACTATATCCG TCAAGTCGCC AAGGCAGCGG CGGCTATTAA AAGCCGTCAT GACTTOTATC 
.\ATACTTACT GATTGATAAT CAGGTTTCTG CGGCAATAAA AACCACCCGG ATCGCCGAAG 
CCATTGCCAG TATTCAACTG TACGTCAACC GGCCATTCGA AAATCTGGAA GAAAATCCCA 
ATTCGGGGCT TATCAGCCGC CAATTCTTTA TCGACTCGGA CAAATACAAT AAACCCTACA 
GCACTTGGGC GGGTGTTTCT CAATTAGTTT ACTACCCGGA AAACTATATT GATCCGACC^ 1500 
TGCGTATCGC ACAAACCAAA ATGATGGACG CATTACTCCA ATCCGTCAGC CAAAGCCA^T 1560 
T.VUCGCCGA TACCGTCGAA GATGCCTTTA TGTCTTATCT GACATCGTIT GAACAAGTCG 1620 
CTAATCTTAA AGTTATTAGC GCATATCACG ATAATATTAA TAACGATCAA GGGCTCACCT 1680 
ATTTTATCGG ACTCAGTOAA ACTGATCCCG GTCAATATTA TTCGCGCAGT GTCGATCACA 1740 
GTAAATTCAA CGACCGTAAA TTCGCGCCTA ATCCCTCGAG TGAATX^CCAT AAAAnCATT 1800 
GTCCAATTAA CCCTTATAAA AGCACTATCC GTCCAGTGAT ATATAAATCC CGCCTCTATC 1860 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
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TGCTCTOGTT GCAACAAAAG GACATCACCA AACAGACAGC AAATACTAAA GATCGCTATC 
AA.nCTCAAAC GGATTATCGT TATGAACTAA A.:iTTGCCCCA TATCCGCTAT GATCCCACTT 
GCAATACCCC AATCACCTTT GATCTCAATA AAAAAATATC CGAGCTAAAA CTCGAAAAAA 
ATAGACCCCC CGGACTCTAT TCTCCCGGTT ATCAACCTCA ACATACGTTG CTCGTGATCT 
TTTAT.>^CCA ACAAGACACA CTAGATACTT ATAAAAACCC TTCAATCCAA GGACTATATA 
TCTTTCCTCA TATCGCATCC AAAGATATCA CCCCAGAACA GAGCAATCTT TATCCGGATA 
ATACCTATCA ACAATTTGAT ACCAATAATC TCAGAACAGT CAATAACCCC TATGC=.GAGC 
ATTATGAGAT TCCTTCTTCG GTAACTAGCC GTAAAGACTA TCGTTCGGCA CATTATrACC 
TCAGCATGGT ATATAACGGA CATATTCCAA CTATCAATTA CAA.»^GCCGCA TCAAGTCATT 
TAAAAATTTA TATTTCACCA AAATTAACAA TTATTCATAA TCCATATCAA GCAC^GAAGC 
GCAATCAATG CAATTTOATG AATAAATATG GCAAACTAGG TCATAAAriT ATTOTCTATA 
CCAGCCTGGG CGTTAATCCG AATAATAACC CCAATTC 



iS2: 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2557 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 845 amino acids 

(B) TYPE: amino acids 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein (partial) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr 
Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn Leu 



Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu Thr 

40 45 

Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys Thr 

55 60 

Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg Lys 
70 75 80 

Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val Phe 

85 90 95 

Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro 

Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly Phe 

120 125 

Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile Ala 

135 

Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu 
150 155 160 

Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu Gly 

170 175 

Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala 
180 185 

Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala Gln 
205 

Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe Arg 

215 220 

Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala 

235 240 



Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp 

250 255 

Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe 

270 



Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn Leu 
285 

Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His Gln 

300 



25 ?o1 T,l '^^^ JJ« ser cys Trp Thr Ser 

Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Lys Cys 

325 330 

Arg Pro Thr Gly Arg Phe Arg Phe Gly Arg Ala Gly Leu Tyr Ser Ile 

345 

Asn Glu Arg Asp Thr Asp Leu Cys Pro Val Gly Lys Arg Gly Arg Arg 
35 3" 365 

Ile Asn Arg Arg Val Glu Phe Asn Asn Arg Leu Ile His Tyr Asn Ala 

375 380 

40 1st ^'"^ ^11 Thr Tyr Tyr He Arg 

395 

Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr 

410 

Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr 

425 430 

Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala 
445 

Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln 

455 460 

55 let "'"^ "^""^ '^^ '-y* A*^' '^^^ ser Thr Trp Aia 

*'° 475 480 

Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr 

485 490 495 

Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val 

500 505 

Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 
65 "0 

Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala 
530 535 540 

Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly 
545 550 555 560 

Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 

565 570 575 

Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 
580 585 590 

His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro 
595 600 605 

Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu 
610 615 620 






Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr 
625 630 635 640 

Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr 

645 650 655 



Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu 
660 665 670 

Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln 
675 680 685 

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu 
690 695 700 



Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp 
705 710 715 720 

Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp 

725 730 735 



Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 
740 745 750 

Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys 
755 760 765 

Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp 
770 775 780 



Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr 
785 790 795 800 

Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys 

805 810 815 



Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys 
820 825 830 



Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn 
835 840 845 



(2) INFORMATION FOR SEQ ID NO: 38: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 
(B) TYPE: amino acid 
(C) STRANDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULAR TYPE: protein 
(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly 

1 5 10 15 

Lys 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 
(B) TYPE: amino acid 
(C) STRANDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULAR TYPE: protein 

(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala 
1 5 10 15 

Ile Ser Pro Ala Lys 

20 






(2) INFORMATION FOR SEQ ID NO: 40: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDNESS: single 
(D) TOPOLOGY: linear 



(ii) MOLECULAR TYPE: protein 
(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Ala Asn Ser Leu Tyr Ala Leu Phe Leu Pro Gln 
1 5 10 
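
The stated LENGTH of a short peptide record can be cross-checked against its residue row. The following is a minimal Python sketch of that check for SEQ ID NO: 40, reading the final residue as Gln; the helper name `residue_count` and the set `VALID_CODES` are hypothetical and not part of the listing, and the sketch assumes residues are written as whitespace-separated three-letter codes.

```python
# Three-letter residue codes accepted in this sketch: the twenty standard
# amino acids plus Xaa for an undetermined residue.
VALID_CODES = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
    "Xaa",
}


def residue_count(row):
    """Count residues in a whitespace-separated row of three-letter codes."""
    tokens = row.split()
    bad = [t for t in tokens if t not in VALID_CODES]
    if bad:
        # Unknown tokens usually indicate a character-recognition error.
        raise ValueError("unrecognized residue codes: " + ", ".join(bad))
    return len(tokens)


seq_id_40 = "Ala Asn Ser Leu Tyr Ala Leu Phe Leu Pro Gln"
print(residue_count(seq_id_40))  # 11
```

Rejecting unknown tokens rather than counting them is a deliberate choice here: a misread code such as `Gin` or `He` is then reported instead of silently inflating or confirming the count.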

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid 
(C) STRANDNESS: Single 
(D) TOPOLOGY: linear 

(ii) MOLECULAR TYPE: protein 

(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDNESS: Single 

(D) TOPOLOGY: linear 

(ii) MOLECULAR TYPE: protein 

(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr 

Ala Gly Leu Glu 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULAR TYPE: protein 

(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 44: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 
(B) TYPE: amino acid 

(C) STRANDNESS: Single 

(D) TOPOLOGY: linear 
(ii) MOLECULAR TYPE: protein 

(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His 

5 10 15 

Arg 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULAR TYPE: protein 

(V) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
Asp Val Xaa Gly Ser Glu Lys Ala Asn Glu Lys Leu Lys 

5 10 



(2) INFORMATION FOR SEQ ID NO: 46: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7551 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 (tcdA): 






i^. If. 51? ji: s s - « 

^ i^l ^. ^ !Si ^Hr. ^. '.f. ^. -UT. ^, » 

CGC CAG CAA GTA TCT GAG CAC CTT TCC *mn *Ty-o 

55 Ar, cm cin Val s« clu SJf ^2 T.l S I^r Sts' ^^p^ S 

40 45 

5/r SI ^^'^ AAT COC CTC TAT CAA GCC 192 

nis Asp Ala Cln Gin Ala Gin L/s Asp Asn Arg Leu Tyr Glu Aid 

55 60 

Irl iTl fl^ '^'^ "^CC CAA TTA CAA AAT GCG CTC C^T CTT 240 

Arg Il« Leu Lys Arg Ala Asn Pro Gin Leu Cln Asn Ala vll Sis S 

A5 '° 75 80 



60 
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m: tH fTf CTC ATA CGC TAT A.:VC AAT "i^ n- 26S 

«IA n« L«u «la Pro Asn Ala 5lu Ha cly Tyr Asn X^n Cln Ji^I " 

«5 30 95 

5 AGC GGT AGA OCC ACT CAA TAT <m CCC CCC GOT \CC GTT TCT Trr • tv- , , - 

ser Cly Ar, Ala s« Gin Tyr Val Ala Pre ?S SI IfJ J^J 

105 

m IT^ I^'^ "^^"^ ACT CAA CPT TAT OCT CAA CC* "cc i-t 

10 Ph, s« Pro Ala Ala Tyr Leu Thr Clu Leu Tyr Irl ?li Ma iT, 

X 2 s 

TTA CAC CCA ACT GAC TCC CTT TAT TAT ctt r^T \nn 
L.U HIS Ala sar Asp Sar vIT ?JJ JJS SJJ JSJ 1% 2^ S 

CTC AAA TCA ATG GCG CTC ACT CAG CAA AAT ATT n^T^ ~. 
Leu Lys Sar Mac Ala Lau Sar Cln ^{J JJ? S §S 

20 155 

Th!^ fJC TCT TTG TCC AAT GAG CTC TTA TTG GAA AGC ATT AAA ACT GAA 
Thr Lau Ser Leu Ser Asn Clu Leu Leu Lau Glu S»r Til J?J S cK 

•'••'^ 170 

IfJ ■^•^■^ •^'^ AAA GTC ATC GAA ATC CTC TCC ACT TTC <;7« 

ser Ly« Lau Glu Asn Tyr Thr Lys Val Mac Clu Mac lIS ThJ 

X8 S 



15 



528 



25 



190 



30 ^ 1^ Si s - |j - z SI ?;j stj s 
SJJ ST iK s: ;^ s s o=ts S is; s.^ s - 

215 220 



35 



Pro Si ?n Til s s Ji^ Sii G?^ Ill r '20 

225 " L^^i Ciy He Asn 

40 235 240 

s tis s SI S2 Si s s ?s c°ts K s 

250 255 



45 



265 270 



364 



55 



50 III 111 S5 SI J« Pro ct^ f" "-^ 

275 ''^^ Tyj^ TVr Asn Leu 

285 

SJJ SJi s s 1^ SI K sf j;;! 

ct; ctS SI Jf! f"" ^^"^ ^'^^ <=T^ •^'^ AGC 960 

n ^in Glu Tyr Ser Asn Asn Cln Leu He Thr Pro Val Val Asn Ser 

60 ^" 315 

SI SI ST K ?;i in ^ji ??i s 
SI ii? SI s Si S SI s?i S SI 



iOOd 



65 



350 



7" SI S SI SI JJi HI SI SI SI SI Sf SI SI Ui 



350 365 
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K ?n jj? ut iii 2j m is §ji s sn ??5 s ^ si 

CCT CAA GTC AAT ATA GAA TAC TCC GCA AAT ATr Am t-t.. 

pro Cin vax a.„ cl. .... s.J JfJ Hi iii Ut HI If! '''' 



10 



390 

400 



GAT ATC ACT CAA CCT TTT CAA ATT CGC cvr apa ^r*^ 
ASP lie s« C.„ Pro Pha CIu ?II ITu S S Vrl 



410 



«S 430 



20 



S sj; s ^ s - ^ ^ - c„ 

Jis - 25 S{ Hi S s JiS s »" 

480 



35 s jis ji: ?jj - S - 



525 

P ili §ts Si 

530 ' ^xu XA« Asp Leu Asn Sor Giy Ser Thr Civ 

45 540 ^ 

z s s ;s ^^: s sj si Ji: s 

33U 555 



S S S SI S Z if. SI Jil ^ SI 
« - - K JJI ^iS S S S ;SI Si S £^ 
, S Si SI ?II SJI Sti S S SI S5 SI Si in 

1 « B I 



^i; s 51? §?; |?j SI Hi s: ti: si 

635 



70 



2i StI ?SJ S SI Sit if. Si IS ?r. JI? S 
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650 

, ?s si? J^s Jjj s s ?s If; in - - fis s; 

If St; ^ n: §k z m m jj| S sj s 
£12 gi $n s.'? IS |; ST= - ^ S s n? 



700 



" |K JJI S 12 S m JJ? IS .^tS ^ 

20 s S5 s is; s i' s 2S ?s! 



m JSf §s s sij s §ts sj; sii m 



2304 



35 



25 740 -7^e 

w'H* I:^'^ '^'^ ^^"^ ^ CCA CAA TTO GAA ATG GTT TAC CAT Tcr 

vai Gin TVr Cys Gin Ala Leu Ala Gin Leu mI? vH ??r Hxl Sr 

30 765 

S |?5 tIS !SS Sti JJS SI S; 5f5 

ii? s S5 SI s Si s; S ss s -° 

795 800 

^ - SI? ?s s 2p%- 51? jjs sj JL* ?5 =°ti 

810 815 

AAA GCG TCC TCG GTG CTA GCG GCA TTT GAA GCT AAC TCC TPA ACQ GCa Pdcs 
Lvs Ala Ser Ser Val Leu Ala Ala Phe Clu All JIS S S ?S? S 

825 830 

ct'n lI^ f?I ^'''^ '^'^ GCT AAT TTG CTC TPC C^A 2544 

Glu Gin Leu Ala Asp Ala Mec Asn Leu Asp Ala Asn Leu Leu Leu Cln 

50 845 

GCC ACT ATT CAA GCA CAA AAT CAT CAA CAT CTT CCC CCA CTA ACT rrs. leo, 
Ala ser He Gin Ala Gin Asn His Gin His lZ So So vli ?5J 

855 860 

SJi i^l m2 I^r ^^^"^ ^'^'^ '^'^T '^^'^ CTC CAA TGC 2640 

365 ^•'^ T'^P Thr Ser lie Asn Thr He Leu Cln Trp 

875 880 

60 vIT °TC GCC CCA CAG GGC GTT TCC GCT 2688 

«) /al Asn Val Ala Gin Gin Leu Asn Val Ala Pro Gin Gly Val Ter Ala 

890 895 

S ^ S aX ?JJ S §tt If* '^^^ ^-^C TAT 2735 

900 ""^ ''^^ '^^'^ Pro Thr Tyr 






'OO 905 910 

GTA TTA ACC CCC CGG nxi AAT TCA 2734 
Ala Gin Trp Glu Asn Ala Ala Gly Val Leu Thr Ala S iSJ Hr 
70 320 

CAA CAG OCT AAT ACA TTA CAC GCT TTT CTC GAT GAA TCT CCC ACT GCC 23 j 2 
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Gin Gin AU Asn Thr Leu His Aid Phe Lau Asp ciu ser Arg Ser Ai^ 



340 



GCA TTA AGC ACC TAC TAT ATC CGT CAA CTC GCC \AG GC\ G-r r-i -r . • • - 

Al. L.U ST Thr ryr Tyr II. Ar, Cln v.l L,f If. ^ia ll "'^^ 



545 950 



960 



ATT AAA AGC CGT GAT GAC TTG TAT CAA TAC TTA CTC ATT GiT -'T -.o-,, 
II* Lys s« Arg Asp Asp Leu Tyr Gin Tyr LeJ lIS As^ A^J cTn 



970 



975 



20 



CTT TCT GCG GCA ATA AAA ACC ACC CGG ATC GCC CAA GCC ATT rr- -.r-^ - 
Val s« Ala Ala Il« Lys Thr Thr Arg 11. aS guI Til ?U Sa 

ATT CAA CTC TAC GTC AAC CGG GCA TTG GAA AAT GTC GAA GAA AAT err , ^, , 
lU Gin L.U Tyr Val A« Arg Ala L.u Glu Asn vTi Gil oit Sa 

1000 i005 

AAT TCG GGG GTT ATC AGC CGC CAA TTC TTT ATC GAC TGG rar ai,* , 

Asn ser Gly Val II. s.r Arg Gin Ph. tIS S 

1015 1020 

^ ^^"^ ^^'^ ^'^'^ GGT GTT TCT CAA TTA CTT TAC T»r v. -,n 

25 Asn^Lys Arg Tyr s.r Thr^Trp Ala Gly Val s.r cVj IS SH lyr ??r 

d55 '^CC ATC CGT ATC GGA CAA ACC V4A 4Ty: -a 

Pro Glu Asn Tyr lU^Asp Pro Thr M.c Argyll. cJi 






1055 



ATC GAC GCA TTA CTC CAA TCC GTC AGC CAA ACC CAA TT^ AAC GCC GAT 151 - 
M.C ASP Ala L.U L.U Gin S.r V.l Ser Gin S.r J^S Ma A^ 

35 1065 1070 

^ SIf S SI? £SJ S ?I? 



SI "I SI ^ 5ir ss? s; ?jj sjs «i ?r 

1090 inL "^^ Asn Asp 

1095 1100 



« £§s 2S s ^ ^jr. s sji Si st; 

m° U15 ' ii2o 



m ?S vl? A^"" t^'' AAA rrc AAC GAC GCT AAA TTC 3408 

lyr Tyr Trp Arg far Val Asp His Ser Lys Ph. Asn Asp Gly Lys Ph. 

1125 1130 1135 

A?f °CC TOO ACT GAA TGG CAT AAA ATT GAT TCT CCA ATT AAC iass 

Ala Ala Asn Ala Trp s.r Glu Trp His Lys II. Asp Cys Pro lH 

55 * 1145 1150 

l?o ^ ^ iS! ^TS "J"" 979 bl^ -^AT ^ TCC CGC CTC TAT 3504 

Ser , 

ii65 

£IS lI^ ^r.^ 5^° t^^ ACA GGA AAT ACT 3552 

1180 



, ^ '^^^ '^'^A °TG ATA TAT AAA TCC CGC CTG TiT 

Pro Tyr Lys^s.r Thr II. Arg Pro^Val II. Tyr Lys s.r Arg lIS ^-r 

71' 7'" ""^ AAC GAC ATC ACC AAA CAG ACA GGA AAT ACT 

L.U Leujrp L.u Clu Cln Lys^Glu II. Thr Lys Cln s« 






A.^A CAT CGC TAT CAA ACT GAA ACC CAT TAT CGT TAT GAA CT^ AAA rrr ^^nn 

Lys^Asp Gly Tyr Cln Thr^Glu Thr Asp Tyr Arg^T^J ctJ lIu lIS 

GCG CAT ATC CCC TAT CAT GCC ACT TCC AAT ACC CCA ATC ACC TTT GAT 3 6<ia 
Ala Hxs II. Arg Tyr^Asp Cly Thr Trp Asnjhr Pro Til ?Sr S 



1215 
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20 



0 



GTC rJ.r AAA AAA ATA TCC GAG CTA AAA CTG CAA AAA AAT ACA CCG CCC 
Vai Asn Lys Lys lie Ser Giu Leu Lys Leu Giu Lys Asn Arg Ala Pro 

1220 1225 1230 



5 CCA CTC TAT TGT GCC GGT TAT CAA GGT CAA GAT ACG TTC CTG GTC ATC 3744 
Gly Leu Tyr Cys Ala Giy Tyr Gin Giy Giu Asp Thr Leu Leu Vai Met 
1235 1240 1245 

TTT TAT AAC CAA CAA GAC ACA CTA GAT ACT TAT AAA AAC GCT TCA ATC 3*9 *> 
1(J Phe T/r Asn Gin Gin Asp Thr Leu Asp Ser T^'r Lys Asn Ala Ser Mec 
1250 1255 1260 

CAA GGA CTA TAT ATC TTT GCT GAT ATC GCA TCC AAA GAT ATC ACC CCA 3340 

Gin Gly Leu Tyr He Phe Ala Asp Met Ala Ser Lys Asp Mec Thr Pro 
15 1265 1270 1275 i230 

GAA CAG AGO AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3833 
Giu Gin Ser Asn Vai Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 

1285 1290 1295 

AAT AAT GTC AGA AGA GTC AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936 
Asn Asn Val Arg Arg Vai Asn Asn Arg Tyr Ala Giu Asp Tyr Giu He 

1300 1305 1310 

CCT TCC TCG GTA ACT ACC GCT AAA GAC TAT GGT TCG GGA GAT T^vT T^^C 3984 

Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Giy Trp Giy Asp Tyr Tyr 
1315 1320 1325 

CTC AGC ATC GTA TAT AAC GCA GAT ATT CCA ACT ATC AAT TAC AAA GCC 4032 
Leu Ser Met Val Tyr Asn Gly Asp He Pro Thr He Asn Tyr Lys Ala 
1330 1335 1340 



GCA TCA ACT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 

Ala Ser ser Asp Leu Lys He Tyr He Ser Pro Lys Leu Arg He He 
5 ^345 1350 .1355 1360 



4080 



CAT AAT GGA TAT GAA GGA CAG AAG CCC AAT CAA TCC AAT CTC ATC AAT 4128 
His Asn Gly Tyr Giu Gly Gin Lys Arg Asn Gin Cys Asn Leu Mec Asn 

1365 1370 1375 

AAA TAT GCC AAA CTA OCT GAT AAA TTT ATT GTT TAT ACT ACC TTC CGG 4176 

Lys Tyr Gly Lys Leu Gly Asp Lys Phe He Val Tyr Thr Ser Leu Giy 

1380 1385 1390 

GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATC TTT TAC CCC GTC TAT 4224 
Vai Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 
1395 1400 1405 

CAA TAT AGC GGA AAC ACC ACT CCA CTC AAT CAA GCC ACA CTA CTA TTC 4272 
Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 
1410 1415 1420 

CAC CGT GAC ACC ACT TAT CCA TCT AAA GTA GAA GCT TCC ATT CCT CCA 4320 
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Giu Ala Trp He Pro Civ 
^425 1430 1435 1440 

GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT GAT TAT 4 368 

Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala He Giy Asp Asp T/r 

1445 1450 1455 

OCT .VCA GAC TCT CTC AAT AAA CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416 
Aid ..ir Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr He Phe 

1460 1465 1470 

ATC ACT GAC ACT AAA CGG ACT GCT ACT GAT GTC TCA GGC CCA GTA GAG 4464 
Met Thr Asp ser Lys Giy Thr Ala Thr Asp Val Ser Giy Pro Vai Giu 
1475 1480 1485 

ATT K\r ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA CTC ?^\A CCG 4512 
lie A5n Thr Ala He Ser Pro Ala Lys Vai Gin He He Vai Lvs Ala 
1490 1495 1500 
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10 






4603 



G.3T GGC KKC GAG CAA ACT TTT ACC CCA GAT AAA GAT CTC TCC ATT CAC i'.iO 
Gly Gly Lys Giu Gin Thr Phe Thr Aid Asp Lys Asp Val ser lH cTn 
iSOS 1510 1515 ^520 

CCA TCA CCT AGC TTT GAT GAA ATC AAT TAT CAA TTT AAT CCC CTT CAA 
Pro ser Pro sac Phe Asp Clu Met Asn T/r Gin Pha Asn Al* Leu Glu 

A525 1530 1535 

ATA GAC CGT TCT GGT CTG AAT TTT ATT AAC AAC TCA GCC ACT ATT GAT 
lie Asp Gly s«r Cly Leu Asn Phe He Asn Asn Ser Ala Ser lie 

^540 1545 1550 

GTT ACT TTT ACC GCA TTT GCG GAG GAT GCC CGC AAA CTC GGT TAT cai AinA 
val Thr Phe Thr Ala Phe Ala Clu Asp Gly Ara ^ ^ JJJ 
*'55 1560 1555 

AGT TTC AGT ATT CCT GTT ACC CTC AAG CTA ACT ACC CAT AAT CCC CTC 4752 

•>0 fs^o^" T^^c^*" ^" Asp Asn Ala S5 " 

-u 1575 1580 

Thr l'^, m« ot^ ^^"^ "^AT CAA TCC CAA TCC 4800 

Thr Leu His His Asn Giu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 

25 ^590 1595 1600 

TAT CCT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT CCA CGC 48^8 
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gin Val Ala Arg 

lo05 1610 16^5 

m1 Thr Thr T?f '^'^ '^^T AAT ATT 4896 

Ala Thr Thr Gly lie Asp Thr He Leu Ser Mec clu Thr Gin Asn lie 

i620 1625 1630 

St'^ 5^° TTA GGC AAA CCT TTC TAT CCT ACG TTC CTC ATA CCT 4944 

Gin Glu Pro Cln Leu Gly Lys Cly Phe Tyr Ala Thr Phe Val lie Pro 
i^jS 1640 1645 

CCC TAT AAC CTA TCA ACT CAT CCT GAT CAA CCT TGG TTT AAG CTT TS.T 4992 
T^cn**" ^'"^ Asp Glu Arg Trp Phe Lys Leu T^t 

•I*'" 1655 1660 

iT^ ^ H^I SIT S*^ ^^"^ ATC TAT TCA GCC CAG 5040 

lie Lys His Val Val Asp Asn Asn Ser His lie lie Tyr Ser Gly Gin 

^5 1670 1675 1680 

i^^^ *T* ACA TTA TTT ATT CCT CTT GAT GAT 5083 

Leu Thr Asp Thr Asn He Asn He Thr Leu Phe He Pro Leu Asp Asp 

16«5 1690 1695 

50 GTC CCA TTG AAT CAA CAT TAT CAC CCC AAG GTT TAT ATC ACC TTC AAC 5136 
Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Mec Thr Phe Lys 

1700 1705 1710 

AAA TCA CCA TCA CAT OCT ACC TCC TCC CGC CCT CAC TTT CTT ACA CAT 5134 
Lys ser Pro Ser Asp Cly Thr Trp Trp Gly Pro His Phe Val Arg Asp 
1'15 1720 1725 

ASD tT- S'^t ti'' ^T* '^'''^ AAA TCC ATT TTC ACC CAT rrr 5232 

fiO ^ ^i!.**" Ser He Leu Thr His Phe 

1730 1735 1740 






ctn ^11 ST? »^ '^^ '^^ A" ^'^ CCA ATC CAT TTC 5230 

Glu Ser Val Asn Val Leu Asn Asn He Ser Ser Clu Pro Mec Asp Phe 

1750 1755 1760 

s-r r?5 ^'^ e'^'^ 7^ '^"'^ TTC TCG CAA CTC TTC TAC TAT ACC CCC 5323 

s„r Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 

1765 1770 1775 

mI? fl:, STT ?r r° ^'^ CAT CAA CAG AAC TTC CAT GAA GCC 5376 

Mec Leu Val Ala Cln Arg Leu Leu His Clu Gin Asn Phe Asp Glu Ala 
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ei iDcrrn rrr euccr m n c oe\ 



1730 



1785 



179C 



5472 






AAC CGT TCC CTG AAA TAT CTC TCG ACT CC\ Tec rv-r t*,, 

r'£,"' "o 2^ ?v" S K 

S5 £S StS S S5 £E 2S - - 

£S - - ™ - 5W Z Slf 

1840 

s §ts s; £^ ^ isj is? 

s s s s s £SI s ?jj s; - 

S5 £ s IK K?;:? s ii? St; s stj 

1»«0 1885 

Si ss; z St £s s £?g ~ - - 

sj ^5 - - - .cc :r„. c„ c« 

1915 ^920 

^ 5If S 5- S - - c« ccj ccj ™ 

1930 ^535 

- £s Ji; ?s - - - ™ sj; ?n 

^ii s| s jjj - - ™ 

1960 1965 

s s sji s sjs o^ts s - ™ - 
tr. s £sj 5s Si s s sk^ s s 

^^^^ 2000 



5303 



5855 



6048 



60 



6096 






- S £?S ^ S £Si - S 2S 

s s j~ St; £ss s ?s s.'? sn 5 

^025 2030 

ss s £ss K ?s? - s st; - 

2040 2045 

r-^ ^i: s= Hi^s; §k ™ 

ACT AAC CTG AGO ATT CAC CAC AAA ACC \TT r^i r^^. 

v-nu ^^HC AAA ACC ATT CAA GAA TTG GAT GCC 6240 
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L-eu Thr Asn L^u 5er li^ Gin Asp Lvs Thr li^ Jiu Glu Leu Asp Ala 
2055 lO'^O 207= 2030 

GAG a-AA ACG GTG TTG GAA A.AA TCC AAA CCG GGA GCA CAA TCG CCC TTT 6Zo^ 
5 Glu Lys Thr Vai Leu Glu Lys Ser Lys Ala Gly Ala Gin Set Arg Phe 

2085 2090 2095 

GAT ACC TAC GCC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA .\AC 63 36 
Asp Ser Tyr Gly Lys Leu T/i Asp Glu Asn He Asn Ala Gly Glu Asn 
U) 2100 2105 2110 






CAA GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT 6384
Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val
2115 2120 2125

CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432
Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile
2130 2135 2140



TTC GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG 6480
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160

ACA GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG 6528
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala
2165 2170 2175

GAT AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG 6576
Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp
2180 2185 2190



GAG ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT 6624
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205

CAG CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA 6672
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys
2210 2215 2220



ACC AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC 6720
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240

CTG CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT 6768
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255

CGA CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT 6816
Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg
2260 2265 2270



TGC CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT 6864
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser
2275 2280 2285

GCC CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG 6912
Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300



CTT GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT 6960
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320

CAT CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG 7008
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335

CTG GCC GAA CTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC 7056
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350
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40 



CTC GCT CAG C.^Ji ATT CAC AAG CTG GTC AGT CAA GGT TCA CGC ACT GCC "i>4 

Ldu Ala Gin Giu He A5p Lys L«u Val Ser Gin Gly Ser Gly Ser 
2355 2360 2365 

CGC AGT GGT AAT AAT AAT* TTG GCG TTC GCC GCC GGC ACG GAC ACT AAA 7152 
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 
2370 2375 2330 

ACC TCT TTG CAG GCA TCA CTT TCA TTC GCT GAT TTG .J^AA ATT CGT GAA 7200 
Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys lie Arg Giu 
2385 2390 2395 2400 

GAT TAC CCG CCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC 7248 
Asp Tyr Pro Ala ser Leu Gly Lys He Arg Arg He Lys Gin He Ser 

2405 2410 2415 

CTC ACT TTC CCC CCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA 7296 
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He 

2420 2425 2430 

TTG TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GCC TCT GAA GCG CTG 7344 
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Giu Ala Leu 
2435 2440 2445 

GCA CTT TCT CAC CGT ATC AAT GAC AGC CGC CAA TTC CAG CTC GAT TTC 7392 
Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Cln Leu Asp Phe 
2450 2455 2460 

AAC GAT GCC AAA TTC CTG CCA TTC GAA CCC ATC GCC ATT GAT CAA CGC 7440 
Asn Asp Gly Lys Phe Leu Pro Phe Clu Gly He Ala He Asp cln Gly 
2465 2470 2475 2430 

ACG CTG ACA CTG AGC TTC CCA AAT GCA TCT ATC CCG GAC AAA CGT AAA 7488 
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Clu Lys Gly Lys 

2485 2490 2495 

CAA GCC ACT ATC TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CCC 7536 
Gin Ala Thr Met Leu Lys Thr Leu Asn Asp He He Leu Hxs He Arg 

2500 2505 2510 



TAC ACC ATT AAA TAA 7551
Tyr Thr Ile Lys
2516



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2516 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47 (TcdA):






Features    From    To      Description

Peptide     1       2516    TcdA proteins
Peptide     89      1937    TcdAii peptide
Fragment    89      100     S2 N-terminus (SEQ ID NO: 13)
Fragment    284     299     (SEQ ID NO: 38)
Fragment    554     563     (SEQ ID NO: 17)
Fragment    1080    1092    (SEQ ID NO: 23; 12/13)
Fragment    1385    1400    (SEQ ID NO: 16)
Fragment    1478    1497    (SEQ ID NO: 39)
Fragment    1620    1642    (SEQ ID NO: 21; 19/23)
Fragment    1938    1948    (SEQ ID NO: 41)
Peptide     1938    2516    TcdAiii peptide
Fragment    2327    2345    (SEQ ID NO: 42)
Fragment    2398    2403    (SEQ ID NO: 43)
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20 



Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
55 60

15 65° 7o* ^""^ **" Ala Val His Leu 

"5 ao 

Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125

Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
135 140

30 145 ilo ^" ''^^ 2'*'^ 

Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
170

Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
185

Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
200 205

Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
215 220

Ala Ser Leu Leu Gly Ile Asn
225 230 235 240

Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255






Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270

Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285

Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300

60 3 05 ^^"^ '^'^ no ^^'^ ^^"^ 111 ^^"^ 

Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
345

Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365

Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
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370 575 330 

Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400

Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415

Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430

Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445

Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460

Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480

Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495

Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510

Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525

Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540

Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560

Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly
565 570 575

Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590

Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu
595 600 605

Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp
610 615 620

Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu Asn Thr Ile Thr Ser Trp
625 630 635 640

Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met Thr Ser
645 650 655

Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp
660 665 670

Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu
675 680 685

Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser
690 695 700

Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro
705 710 715 720

Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr
725 730 735

Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile
740 745 750
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Val Gin TVr cys Gin Ala L«u Ala Cln L*u Glu Met Vai T/r H^s 3*r 
755 760

Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu
770 775 780

Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser

Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu
805 810

Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala
820 825

Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
840

20 850 ^^"^ 85? ^^'^ 






860 

Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp
865 870 875 880

Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala
885 890 895

Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr

Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser
915 920 925

Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940



Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala
950 955 960



Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
970 975

Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990

Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala
995 1000 1005

50 1010^^^ 1015°^" 1020^''*' '^^^ 

Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr
1025 1030 1035 1040



Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met
1045 1050 1055

Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp
1060 1065 1070

Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val
1075 1080 1085

tn^A^"*" ''^^ ^^"^ '^^ Asp Asn He Asn Asn Asp 

"J 1090 1095 1100 



Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu
1105 1110 1115 1120



Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe
1135
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Ma Ala ksn Ala Trp Ser Glu Trp His Lys lid Asp Cys Pro 11^ Asn 

1140 1145 1150

Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr
1155 1160 1165



Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser
1170 1175 1180

Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu
1185 1190 1195 1200

Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp
1205 1210 1215

Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro
1220 1225 1230

Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245



Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met
1250 1255 1260

Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro
1265 1270 1275 1280

Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe Asp Thr
1285 1290 1295

Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile
1300 1305 1310

Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr
1315 1320 1325



Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala
1330 1335 1340

Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile
1345 1350 1355 1360

His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn
1365 1370 1375

Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly
1380 1385 1390

Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr
1395 1400 1405



Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe
1410 1415 1420

His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly
1425 1430 1435 1440

Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr
1445 1450 1455

Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe
1460 1465 1470

Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485



Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala
1490 1495 1500

Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln
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5 



1510 1515 1520

Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu
1525 1530 1535

Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp
1540 1545 1550

Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu
1555 1560 1565



Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu
1570 1575 1580

Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser
1585 1590 1595 1600

Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg
1605 1610 1615

Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile
1620 1625 1630

Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro
1635 1640 1645

r650"'^^" '^^^ 1655^^^ 



1660



Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln
1665 1670 1675 1680

Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp
1685 1690 1695

Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys
1700 1705 1710

Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp
1715 1720 1725

Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe
1730 1735 1740

Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe
1745 1750 1755 1760

Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro
1765 1770 1775

Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala
1780 1785 1790

Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His
1795 1800 1805

'rrp Asn Val Arg Pro Leu Leu Glu Asp 
^310 1315 ' 1820 

To^c^®^ '^^^ Asp Ser Val Asp Pro Asp Ala Val 

^^^^ 1830 1335 1340 

Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr
1845 1850 1855

Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu
1860 1865 1870

Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His
1875 1880 1885
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L^u L*u Civ Asp Lys Pro T/r Leu Pro Leu Ser Thr Thr Trp Ser Asp 
1890 1895 1900

Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp
1905 1910 1915 1920

Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu
1925 1930 1935

Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile
1940 1945 1950



Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965

Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro
1970 1975 1980

Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val
1985 1990 1995 2000

Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu
2005 2010 2015

Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln
2020 2025 2030



Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp
2035 2040 2045

Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060

Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080

Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe
2085 2090 2095

Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn
2100 2105 2110



Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val
2115 2120 2125

Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile
2130 2135 2140

Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160

Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala
2165 2170 2175

Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp
2180 2185 2190



Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205

Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys
2210 2215 2220

Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240

Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255

Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg
2260 2265 2270
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Cys Leu Mec Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn Asp Asp s^r 
2275 2280 2285 

Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300

Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320

His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335

Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350

Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala
2355 2360 2365

Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys
2370 2375 2380



Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu
2385 2390 2395 2400

Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser
2405 2410 2415

Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile
2420 2425 2430

Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445

Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe
2450 2455 2460



Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly
2465 2470 2475 2480

Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495

Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg
2500 2505 2510

Tyr Thr Ile Lys
2516



(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5547 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48 (tcdAii coding region):



CTG ATA GGC TAT AAC AAT CAA TTT AGC GGT AGA GCC AGT CAA TAT GTT 48
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val
1 5 10 15

GCC CCG GGT ACC GTT TCT TCC ATG TTC TCC CCC GCC GCT TAT TTG ACT 96
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr
20 25 30
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V.AA CTT TAT CGT GAA CCA CGC AAT TTA CAC GCA .VGT GAC TCC CTT TnT liJ 
oiu L«u TVr Arg Glu Ala Arg Asn Leu His Ala Ser Asp ser vTl T-/r 

40 45 

TAT CTG GAT ACC CGC CGC CCA GAT CTC AAA TCA ATC CCG CTC ACT C4G 1<5> 
Tyr Leu Asp Thr Arg Arg Pro Asp L«u L/s ser M»t Xi* Gin 

55 50 

C.^ AAT ATG GAT ATA GAA TTA TCC ACA CTC TCT TTC TCC AAT GAC CTG :>sn 
Gin Asn M.C Asp li« Glu Lau Ser Thr L«u s*r j^J clu 

° ~° 75 ao 

TTA TTG GAA ACC ATT AAA ACT GAA TCT AAA CTC CAA AAC TAT ACT AAA 2aa 
15 L*u L«u Ciu set lU Lys Thr Glu Ser Lys Lau cYu ?JJ S J^J 

85 90 95 

w*^ ^''^ '^■''■^ '^'^'^ A^'T TTC CGT CCT TCC CGC CCA ACG CCT TAT l-ss 

val M« Glu M« Leu Ser Thr Phe Arg Pro s»r ^ Ala w5 






CAT GAT OCT TAT GAA AAT GTG CGT GAA GTT ATC CAG CTA CAA GAT CCT 
HIS Asp Ala TVr Clu Asn Val Arg Clu Val He Gin Zlu cS A^ p" 

120 125 

GGA CTT CAC CAA CTC AAT GCA TCA CCG GCA ATT GCC GGG TTG ATT pit a-i-. 
Gly Leu Clu Cln Lau Asn Ala Sar Pro Ala ill Ala ^ i^IS mI? Sxs 

135 

V^^ AAC CCT TCA ATC TCC CCT GAC CTA TTT dSO 

Cln Ala S«r L«u Leu Cly He Asn Ala Ser He Ser Pro Su lIu 

150 155 

AAT ATT CTC ACC GAG GAC ATT ACC GAA OCT AAT CCT GAG GAA CTT TAT 528 
Asn He Leu Thr Glu Clu He Thr Clu Cly Asn Ala CiS Glu Leu 

165 170 

J?S JJI JSI ill ct* CCC TCA TTC CCT ATC CCG GAA 576 

i-ys uys Asn pne ciy Asn He Clu Pro Ala Ser Leu Ala Met Pro Glu 

185 190 

TAC CTT AAA CGT TAT TAT AAT TTA ACC GAT GAA GAA CTT ACT CAG TTT 654 
Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Clu SI cln K * 
'■'^ 200 205 

lH Glv ^ ^'^ IT "^^^ TAT ACT AAT AAC CAA 672 

He Gly Lys Ala Ser Asn Phe Gly Gin Oln Glu Tyr Ser Asn Asn Gin 

215 220 

lII fH Thl Pro vlt SI? ^""^ ^^"^ °" TAT 720 

^eu ii« Thr Pro Val Val Asn Sar Ser Asp Gly Thr Val Lys Val Tyr 

A?S ill Thr fr^ r'^ l^"^ t^" TAT CAA ATX3 GAT CTC GAG 768 

Arg He Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gin Met Asp Val clu 

250 255 

fl^ IP" ""^ °^*^ '^T CAG AAT TAT CCG TTA GAT T^T AAA TTC AAA a i <: 

Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu m JJJ 

265 270 

m SI Itl f?^ 1^"^ "A AAC TTA AAT GAT AAA AGA 364 

Asn Phe Tyr Asn Ala Ser Tyr Leu Ser He Lys Leu Asn Asp Lys Arg 

CAA CTT CTT CCA ACT CAA CGC CCT CCT CAA GTC AAT ATA GAA T4C TCC 915 
Clu Leu val Arg Thr Glu Gly Ala Pro Gin Val Asn He cK ?Jr ler 

GCA AAT ATC ACA TTA AAT ACC GCT CAT ATC ACT CAA CCT TTT GAA ATT 960 
Ala Asn He Thr Leu Asn Thr Ala Asp He Ser Gin Pro gTS lU 
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310 



GGC CTC ;vcA CCA GTA CTT CCT TCC GGT T'-T rr^- -r-^ 

Cix Thr Ar, val Leu Pro lej ^ ^ 



330 



AAA TTT ACC CTT CAA GAG TAT Vftc CAA T'/- <tv.« 

LV. P». Th. VJ. Ciu ^ CJJ CTT .C. 

s s s Si s s sj; n; - - - s 



1:> CTG CAA GGC ATT GTG CGC AGT GTT AAT CTX r^ii r-»v. 

c:„ a. v., s„ S S SJi S S JJS 

380 

:o ^ 5if Hi SIT 25 - - - - 



400 



25 



JIT s s s s K s jis s 

410 

CAA OCT TCA TAT CAT AAT CAA ccr Arr r-A* 

«, s„ jv; *.p HI §}i S S K HI 

30 430 



35 



tS S S iJS S?J St5 Z i^'r ^ I''' 

435 440 ^ 

S JJJ S S S S Si ^ T ^ 

450 i?c ^ ^ lie Leu Lys 

s 2: - s S iii - s 

s s SI s £^ S s {h S zi; 



45 



50 505 



ss 2j s 2s ir. rd 5it ^ '»« 

520 



55 



ThJ X"^; E I« III ^T. f ^ ."^^ ^^"^ A-I-C AGA 163 2 

530 * f?5 Asp Lys cin L«u Ala Thr Leu He Arg 

535 540 

l!?^ L^ iSJ is sS ?S Lli Hf ^^^'^ ^^-^ °TA 1680 

545 ^""^ ^'^P "is Thr Gin Lys Trp Ser Val 

555 5^0 

,3 StS S SI ?^ - - «C «J c„ ^ 

s §ti ^: ™ ™ - ?s ^° - „c CCT ™ Z CCT 

70 585 590 

TTT GAT AAA CAC AAA GCA CAT TTG CTA CAT CTC ATC GCC CCC TAT ATT 1324 
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Phe Asp L/s Asp Lys Aid Asp Lau L«u His Val Mec Ala Pro Tyr Il« 
555 600 605 

GCG GCC ACC TTG CAA TTA TCA TCC GAA AAT CTC GCC CAC TCG GTA CTC 1372 
Ala Ala Thr Lou Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Ldu 
610 615 620 

CTT TCG GCA GAT AAG TTA CAG CCC GGC GAC GGC GCA ATG ACA GCA GAA 1920 
Leu Trp Ala Asp Lys Leu Gin Pro Gly Asp Gly Ala Mec Thr Ala Glu 
z2^ 63G 635 640 

AAA TTC TGG GAC TGG TTG AAT ACT AAG TAT ACG CCG CGT TCA TCG GAA 1S63 
Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu 

645 650 655 

GCC GTA GAA ACG CAG GAA CAT ATC CTT CAG TAT TGT CAG CCT CTC GCA 2016 
Ala Val Glu Thr Gin Glu His He Val Gin Tyr O/s Gin Ala Leu Ala 

660 665 670 

CAA TTG GAA ATG CTT TAC CAT TCC ACC GGC ATC AAC GAA AAC GCC TTC 2064 

Gin Leu Glu Met Val Tyr His Ser Thr Gly He Asn Glu Asn Ala Phe 
675 680 685 

CGT CTA TTT GTC ACA AAA CCA GAG ATG TTT GGC GCT GCA ACT GGA GCA 2112 
Arg L«u Phe Val Thr Lys Pro Giu Mec Phe Gly Ala Ala Thr Gly Ala 
690 695 700 

GCG CCC GCG CAT GAT GCC CTT TCA CTG ATT ATG CTG ACA CGT TTT GCG 2160 

Ala Pro Ala His Asp Ala Leu Ser Leu He Mec Leu Thr Arg Phe Ala 

705 710 715 720 

GAT TGG GTG AAC GCA CTA GGC GAA AAA GCG TCC TCG GTG CTA GCG GCA 2208 

Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala 

725 730 735 

TTT GAA GCT AAC TCG TTA ACG GCA GAA CAA CTG GCT GAT GCC ATC AAT 2256 
Phe Glu Ala Asn Ser Leu Thr Ala Glu Gin Leu Ala Asp Ala Mec Asn 

740 745 750 

CTT GAT GCT AAT TTG CTG TTC CAA GCC ACT ATT CAA GCA CAA AAT CAT 23 04 
Leu Asp Ala Asn Leu Leu Leu Gin Ala Ser He Gin Ala Gin Asn His 
755 760 765 

CAA CAT CTT CCC CCA CTA ACT CCA CAA AAT CCC TTC TCC TGT TCC ACA 2352 
Gin His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr 
770 775 780 

TCT ATC AAT ACT ATC CTG CAA TGG CTT AAT GTC GCA CAA CAA TTG AAT 2400 
Ser He Asn Thr He Leu Gin Trp Val Asn Val Ala Gin Gin Leu Asn 

785 790 795 800 

GTC GCC CCA CAG GGC CTT TCC GCT TTC CTC GOG CTC CAT TAT ATT CAA 2443 
Val Ala Pro Gin Gly Val Ser Ala Leu Val Gly Leu Asp Tyr He Gin 

805 810 315 

TCA ATG AAA GAG ACA CCG ACC TAT GCC CAG TGG GAA AAC GCG GCA GGC 2496 
s^r Mec Lys Glu Thr Pro Thr Tyr Ala Gin Trp Glu Asn Ala Ala Gly 

820 825 830 

CTA TTA ACC GCC GGG TTG AAT TCA C.\k CAC GCT .AAT ACA TTA CAC CCT 2544 
Val Leu Thr Ala Gly Leu Asn Ser Gin Gin Ala Asn Thr Leu His Ala 
835 840 845 

TTT CTG GAT GAA TCT CCC ACT GCC GCA TTA ACC ACC TAC TAT ATC CGT 2592 
Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr T^'r He Arg 
350 855 860 

C.-A CTC GCC .AAG GCA GCG GCG GCT ATT AAA AGC CGT CAT GAC TTG TAT 2640 
Cln Val Ala Lys Ala Ala Ala Ala He Lys Ser Arg Asp Asp Leu Tyr 
865 870 875 880



C,iA T.-.C TTA CTC ATT GAT ?.\T CAG GTT TCT CCG GCA ATA AA^ -CC '-i- 
Gin Tyr Leu Lmii lU Asp Asn Gin V4I S*r Ala Ala lie Lys Thr "^hr 

335 890 355 

CGG ATC GCC GAA GCC ATT GCC AGT ATT Ck.\ CTG TAG 3TC AAC COG CCA ''''i 
Arg lie Ala Glu Aid lie Ala Ser He Gin Leu Tyr Val Asn Ma Ala 

500 505 910 " 

TTG GAA AAT CTG GAA G.AA AAT GCC A.=iT TCG GGG GTT ATC ACC ^GC C^i '-34 
Leu Glu Asn Val Giu Glu Asn Ala Asn Ser Gly Val lie Ser Arg Gin " 
515 920 92S 



TTC TTT ATC GAC TGG GAC .^AA TAC AAT AAA CGC TAC AGC ACT TCG CCG 23i ^ 
15 930 935 '^^ ^^"^ I'^l '^^^ "^^P 



vIT Ifl ?^ I""^ ^^"^ '^AT ATT GAT CCG ACC 2330 

Gly Val Ser Gin Leu Val Tyr Tyr Pro Giu Asn Tyr He Asp Pro Thr 

550 955 960 

ATC CCT ATC GCA CAA ACC AAA ATC ATC GAC GCA TTA CTC CAA TCC GTC ''923 
Mec Arg He Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Vai " 

565 970 575 

25 AGC C.=^ AGC CAA TTA AAC GCC GAT ACC CTC CAA GAT GCC TTT ATG TCT "^576 
p«r Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Aia Phe Mec Ser " 

530 985 

TAT CTC ACA TCG TTT CAA CAA CTC CCT AAT CTT AAA GTT AIT ACC CCA 3 024 
Ti-r Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val lie Ser Aia 
555 1000 1005 

TAT CAC GAT AAT ATT AAT AAC GAT CAA CCG CTC ACC TAT TTT ATC CCA 3072 

''^ ^if^'^*^ G^*^ Leu Thr Tyr Phe lie Gly 

iOiO 1015 1020 

CTC AGT GAA ACT GAT GCC GGT GAA TAT TAT TCG CGC AGT GTC CAT C\C 3120 
Leu ser Glu Thr Asp Aia Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 
^^25 1030 1035 1040 

AGT AAA TTC AAC GAC GGT AAA TTC CCG CCT AAT GCC TGG AGT GAA TCG 3153 
i>er Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 

^►045 1050 1055 

CAT AAA ATT GAT TCT CCA ATT AAC CCT TAT AAA AGC ACT ATC CGT CCA 3216 
His Lys He Asp Cys Pro He Asn Pro Tyr Lys Ser Thr He Arg Pro 

1060 1065 iQ7Q 

tT^ I^*^ ^ ^'^ '^A'T CTC CTC TGG TTC GAA CAA AAG GAG 3264 

Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 
i075 1080 1085 

ATC ACC AAA CAG ACA GGA AAT AGT AAA GAT GGC TAT CAA ACT GAA ACG 3 312 

« T^f«^^* Asp Gly Tyr Gin Thr Glu Thr 

1090 1095 1100 



GAT TAT CCT TAT CAA CTA AAA TTC GCC CAT ATC CGC TAT GAT GGC ACT 3 360 

•^??c^^ '^^"^ Leu Ala His He Arg Tyr Asp Gly Thr 

1105 1110 1115 ^^20 

TGG AAT ACG CCA ATC ACC TTT CAT GTC AAT iJ^A AAA ATA TCC GAG CTA 3403 
Trp Asn Thr Pro He Thr Phe Asp Val Asn Lys Lys He Ser Giu Leu 

1125 1130 1135 

.-A?^ CTC CAA AAA AAT ACA CCG CCC CCA CTC TAT TCT GCC GGT T^T CA.n 3 4=5 
Lvs Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly T/r Gin 

1140 1145 1150 

GGT GAA GAT ACG TTC CTC CTC ATC TTT TAT AAC CAA CAA CAC \CA CT^ 3 504 

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu 
1155 1160 1165 
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CAT ACT TAT AAA AAC CCT TCA ATC C^A GGA CTA TAT ATC TTT GCT GAT 3552 
Asp Ser Tyr Lys Asn Ala S^r Mac Gin Gly L«u Tyr lie Phe Ala Asp 
1170 ilT5 1180 

5 

ATC CCA TCC AAA GAT ATG ACC CCA GAA CAG AGC AAT GTT TAT CGG GAT 3600 
MdC Ala Ser Lys Asp Met Thr Pro Giu Gin s«r Asn Val Tyr Ar? Asp 
lldS 1190 1195 1200 

10 ?J^r AGC TAT CAA C.\^ TTT GAT ACC ,^T r-AT GTC AGA AGA OTG AAT .^AC 3643 
Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 

1205 1210 1215 

CGC TAT GCA GAG GAT TAT GAG ATT CCT TCC TCG GTA ACT AGC CGT .^AA 3696 

15 Arg Ti'r Ala Giu Asp Tyr Giu lie Pro Ser Ser Val Ser Ser Arg Lys 

1220 1225 1230 

GAC TAT GGT TGG GGA GAT TAT TAC CTC AGC ATG GTA TAT AAC GGA GAT 3744 
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp 
20 1235 1240 1245 






ATT CCA ACT ATC AAT TAC AAA GCC GCA TCA AGT GAT TTA AAA ATC TAT 3792 

lie Pro Thr lie Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys lie Tyr 
1250 1255 1260 

ATC TCA CCA AAA TTA AGA ATT ATT CAT AAT GGA TAT GAA GGA CAG AAG 3340 

lie Ser Pro Lys Leu Arg lie lie His Asn Gly Tyr Giu Gly Gin Lys 

1265 1270 1275 1280 



30 CGC AAT CAA TGC AAT CTG ATG AAT AAA TAT CGC AAA CTA GGT GAT AAJ^ 3883 
Arg Asn Gin Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys 

1285 1290 1295 

TTT ATT GTT TAT ACT AGC TTG GGG GTC AAT CCA AAT AAC TCG TCA AAT 3936 
35 Phe lie Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 

1300 1305 1310 

AAG CTC ATG TTT TAC CCC GTC TAT CAA TAT AGC GCA AAC ACC AGT GGA 3984 

Lys Leu Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 
40 1315 1320 1325 

CTC AAT CAA GGG AGA CTA CTA TTC CAC CGT GAC ACC ACT TAT CCA TCT 4032 
Leu Asn Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr T^'r Pro Ser 
1330 1335 1340 



AAA GTA GAA GCT TGG ATT CCT GGA GCA AAA CGT TCT CTA ACC AAC C.Vt 4080 
Lys Vai Giu Ala Trp lie Pro Giy Ala Lys Arg Ser Leu Thr Asn Gin 
1345 1350 1355 1360 



50 AAT GCC GCC ATT GGT GAT GAT TAT GCT AC A GAC TCT CTG .AAT AAA CCG 4128 
Asn Ala Ala lie Giy Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 

1365 1370 1375 

GAT GAT CTT AAG CAA TAT ATC TTT ATG ACT GAC AGT AAA GGG ACT GCT 417 6 
55 Asp Asp Leu Lys Gin Tyr lie Phe Met Thr Asp Ser Lys Gly Thr Ala 

1380 1385 1390 

ACT GAT GTC TCA GGC CCA GTA GAG ATT AAT ACT GCA ATT TCT CCA GCA 4224 

Thr Asp Vai Ser Gly Pro Vai Giu lie Asn Thr Ala lie Ser Pro Ala 
60 1395 1400 1405 

AAA GTT CAG ATA ATA GTC AAA GCG GGT GGC KkG GAG CAA ACT TTT ACC 4 2 "7 2 

Lys Vai Gin lie lie Vai Lys Ala Giy Gly Lys Giu Gin Thr Phe Thr 
1410 1415 1420 

fi5 

GCA GAT AAA GAT GTC TCC ATT CAG CCA TCA CCT AGC TTT GAT GAA ATG 4320 

Ala Asp Lys Asp Val Ser lie Gin Pro Ser Pro Ser Phe Asp Giu Met 

1425 1430 1435 1440 

70 AAT TAT CAA TTT AAT GCC CTT GAA ATA GAC GGT TCT GGT CTG AAT TTT 43 68 
Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe
1445 1450 1455

ATT k^C AAC TCA CCC ACT ATT GAT GTT ACT TTT ACC GCA TTT GCC GAG 4416 
lie Asn Asn S«r Ala Ser lie Asp Vai Thr Phe Thr Aid Phe Aia Giu 

1460 1465 147C 

GAT GGC CGC AAA CTG GGT TAT GAA AGT TTC AGT ATT CCT GTT ACC CTC 4464 

Asp Giy Arg Lys Leu Gly P/r Giu Ser Phe Ser lie Pro Val Thr Leu 
1475 1480 1485 

?JxG GTA AGT ACC GAT AAT CCC CTG ACC CTG CAC CAT .\.AT GAA AAT GGT 4512 
Lys Vai Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Giu Asn Gly 
1490 1495 1500 



15 GCG CAA TAT ATG CAA TGG CAA TCC TAT CGT ACC CGC CTG AAT ACT CTA 4560 

Ala Gin Tyr Mec Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 
1505 1510 1515 1520 

TTT GCC CCC GAG TTC GTT CCA CCC CCC ACC ACC CCA ATC CAT AC A ATT 4603 

20 Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Gly lie Asp Thr lie 

1525 1530 1535 

CTG AGT ATG GAA ACT CAG AAT ATT CAC GAA CCC CAG TTA GGC AAA GGT 4656 

Leu Ser Mec Clu Thr Gin Asn lie Cln Clu Pro Cln Leu Gly Lys Gly 
25 1540 1545 1550 

TTC TAT GCT ACG TTC GTG ATA CCT CCC TAT AAC CTA TCA ACT CAT GGT 4704 

Phe Tyr Ala Thr Phe Val lie Pro Pro Tyr Asn Leu ser Thr His Giy 

1555 1560 1565 



GAT GAA CGT TGG TTT AAG CTT TAT ATC AAA CAT GTT GTT GAT AAT AAT 4752 

Asp Giu Arg Trp Phe Lys Leu Tyr lie Lys His Val Val Asp Asn Asn 
1570 1575 1580 



35 TCA CAT ATT ATC TAT TCA GGC CAG CTA ACA GAT AC A AAT ATA AAC ATC 4800 

Ser His lie lie Tyr Ser Giy Gin Leu Thr Asp Thr Asn lie Asn lie 
1585 1590 1595 1600 

ACA TTA TTT ATT CCT CTT GAT GAT GTC CCA TTG AAT CAA GAT TAT CAC 4343 
40 Thr Leu Phe lie Pro Leu Asp Asp Val Pro Leu Asn Gin Asp Tyr His 

1605 1610 1615 

GCC AAG GTT TAT ATG ACC TTC AAG AAA TCA CCA TCA GAT GGT ACC TGG 489 6 

Ala Lys Val Tyr Mec Thr Phe Lys Lys Ser Pro Ser Asp Giy Thr Trp 
45 1620 1625 1630 

TGG GGC CCT CAC TTT GTT AGA GAT GAT AAA GGA ATA GTA ACA ATA AAC 4944 
Trp Giy Pro His Phe Val Arg Asp Asp Lys Gly He Val Thr lie Asn 
1635 1640 1645 



CCT AAA TCC ATT TTG ACC CAT TTT GAG AGC GTC AAT GTC CTG AAT AAT 4992 
Pro Lys Ser He Leu Thr His Phe Giu Ser Val Asn Val Leu Asn Asn 
1650 1655 1660 



55 ATT AGT AGC GAA CCA ATG GAT TTC AGC GGC GCT AAC AGC CTC TAT TTC 5040 

lie Ser Ser Giu Pro Mec Asp Phe ser Giy Ala Asn Ser Leu Tyr Phe 
1665 1670 1675 1680 

TGG GA-A CTG TTC TAC TAT ACC CCG ATC CTG CTT CCT CAA CGT TTG CTG 5088 

AO Trp Giu Leu Phe Tyr Tyr Thr Pro Mec Leu Val Ala Gin Arg Leu Leu 

1685 1690 1695 

CAT CAA CAC AAC TTC GAT GAA GCC AAC CCT TGG CTG AAA TAT GTC TGG 5136 

His Giu Gin Asn Phe Asp Giu Ala Asn Arg Trp Leu Lys Tyr Val Trp 
65 1700 1705 1710 

AGT CCA TCC CGT TAT ATT GTC CAC CCC CAC ATT CAC AAC TAC CAC TGG 5134 

Ser Pro Ser Cly Tyr lie Val His Gly Gin lie Gin Asn T^'r Cln Trp 
1715 1720 1725 



AAC GTC CGC CCG TTA CTG GAA CAC ACC ACT TGG AAC AGT GAT CCT TTG 52i2 
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Asn Vai Arg Pro L«u Leu Glu Asp Thr Scr Trp Asn Sex Asp Pro Leu 
1730 1735 1740 

GAT TCC CTC GAT CCT GAC GCG GTA GCA CAG CAC GAT CCA ATG CAC TAC 5230 
Asp Ser Val Asp Pro Asp Ala Val Ala Gin His Asp Pro Mec His T/r 
1745 1750 1755 1760 

AAA GTT TCA ACT TTT ATG CGT ACC TTG GAT CTA TTG ATA GCA CGC GGC 53 23 
L/3 Val Ser Thr Phe Mec Arg Thr Leu Asp Leu Leu lie Ala Arg Gly 

1765 1770 1775 

GAC CAT OCT TAT CGC CAA CTG GAA CGA GAT AC A CTC AAC CAA GCG AAG 5376 
Asp His Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 

1780 1785 1790 

ATG TGG TAT ATG CAA GCG CTG CAT CTA TTA GGT GAC AAA CCT TAT CTA 5424 
Mec Trp Tyr Met Gin Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 
1795 1800 1805 

CCG CTG AGT ACG ACA TGG ACT GAT CCA CGA CTA GAC AG A GCC GCG CAT 5472 
Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 
1810 1815 1820 

ATC ACT ACC CAA AAT GCT CAC GAC AGC GCA ATA CTC CCT CTG CCG CAG 5520 
lie Thr Thr Gin Asn Ala His Asp Ser Ala He Val Ala Leu Arg Gin 
1325 1830 1335 1340 

AAT ATA CCT ACA CCC CCA CCT TTA TCA 5547 
Asn He Pro Thr Pro Ala Pro Leu Ser 

1845 1849 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1849 amino acids 

(B) TYPE: amino acids 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 (TcdAii):

Features   From   To     Description
Peptide    1      1849   TcdAii peptide
Fragment   1      12     S2 N-terminus (SEQ ID NO: 13)
Fragment   196    211    (SEQ ID NO: 38)
Fragment   466    475    (SEQ ID NO: 17)
Fragment   993    1004   (SEQ ID NO: 23; 12/13)
Fragment   1297   1312   (SEQ ID NO: 18)
Fragment   1390   1409   (SEQ ID NO: 39)
Fragment   1532   1554   (SEQ ID NO: 21; 19/23)



Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val
1 5 10 15

Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr
20 25 30

Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr
35 40 45

Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln
50 55 60

Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu
65 70 75 80

Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys
85 90 95

Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr
100 105 110

His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro
115 120 125

Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His
130 135 140

Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe
145 150 155 160

Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr
165 170 175

Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu
180 185 190

Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe
195 200 205

Ile Gly Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln
210 215 220



Leu Ile Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr
225 230 235 240

Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu
245 250 255

Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys
260 265 270

Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg
275 280 285

Glu Leu Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser
290 295 300

Ala Asn Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile
305 310 315 320

Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala
325 330 335

Lys Phe Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu
340 345 350



Asn Lys Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile
355 360 365

Leu Glu Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr
370 375 380

Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr
385 390 395 400

Ala Ile His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser
405 410 415

Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn
420 425 430

Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile
435 440 445

Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys
450 455 460

Arg Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys
465

Thr Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn
485 490 495

Leu Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu
500 505 510

Thr Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys
515 520 525

Thr Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg
530 535 540

Lys Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val
545 550 555 560

Phe Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr
565 570 575

Pro Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly
580 585 590

Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile
595 600 605

Ala Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu
610 615 620

Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu
625 630 635 640

Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu
645 650 655

Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala
660 665 670

Gln Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe
675 680 685

Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala
690 695 700

Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala
705 710 715 720

Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala
725 730 735

Phe Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn
740 745 750

Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His
755 760 765

Gln His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr
770 775 780

Ser Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn
785 790 795 800

Val Ala Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln
805 810 815

Ser Met Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly
820 825 830

Val Leu Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala
835 840 845

Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg
850 855 860

Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr
865 870 875 880

Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr
885 890 895

Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala
900 905 910

Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln
915 920 925

Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
930 935 940

Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr
945 950 955 960

Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val
965 970 975

Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser
980 985 990

Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala
995 1000 1005

Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly
1010 1015 1020

Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His
1025 1030 1035 1040

Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp
1045 1050 1055

His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro
1060 1065 1070

Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu
1075 1080 1085

Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
1090 1095 1100

Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
1105 1110 1115 1120

Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
1125 1130 1135

Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
1140 1145 1150

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
1155 1160 1165

Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
1170 1175 1180

Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
1185 1190 1195 1200

Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215

Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys

1220 1225 1230

Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245

Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
1250 1255 1260

Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1265 1270 1275 1280

Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1300 1305 1310

Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly
1315 1320 1325

Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser
1330 1335 1340

Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln
1345 1350 1355 1360

Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro
1365 1370 1375

Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala
1380 1385 1390

Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala
1395 1400 1405

Lys Val Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr
1410 1415 1420

Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met
1425 1430 1435 1440

Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe
1445 1450 1455

Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu
1460 1465 1470

Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu
1475 1480 1485

Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly
1490 1495 1500

Ala Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu
1505 1510 1515 1520

Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile
1525 1530 1535

Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly
1540 1545 1550

Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly
1555 1560 1565

Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn
1570 1575 1580

Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile
1585 1590 1595 1600

Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His
1605 1610 1615

Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp
1620 1625 1630

Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn
1635 1640 1645

Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn
1650 1655 1660

Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe
1665 1670 1675 1680

Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu
1685 1690 1695

His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp
1700 1705 1710

Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp
1715 1720 1725

Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu
1730 1735 1740

Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr
1745 1750 1755 1760

Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly
1765 1770 1775

Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys
1780 1785 1790

Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu
1795 1800 1805

Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp
1810 1815 1820

Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln
1825 1830 1835 1840

Asn Ile Pro Thr Pro Ala Pro Leu Ser
1845 1849



(2) INFORMATION FOR SEQ ID NO: 50: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 (TcdAiii coding region) 

TTG CGC ACC CCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC AAT 48 

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin He Asn 
i 5 10 15 • * 



65 GAA GTG ATG ATC AAT TAC TGG CAC ACA TTA GCT CAC AGA CTA TAC AAT 96 
Glu Val Mec Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Vai Tyr Asn 

20 25 30 

CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA ATC 144 
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Leu Arg His Asn Leu Set lie Asp Gly Gin Pro Leu Tyr Leu Pro lU 
35 40 45 

TAT GCC ACA CCG GCC GAT ZZC XnA GCG TTA CTC AGC GCC GCC GTT GCC 15. 
Tyr Ala Thr Pro Aia Asp Pro L/s Aid Leu Leu Ser Ala Aia Val Ala 
50 55 60 

ACT TCT CAA GGT GGA GGC CTA CCG G.^A TCA TTT ATG TCC CTG TGG 240 

Thr Ser Gin Giy Gly Gly Lys Leu Pro Giu Ser Phe Mec Ser Leu Trp 
65 70 75 30 

COT TTC CCG CAC ATG CTG GAA GCG CGC GGC ATG GTT AGC CAG CTC 23o 

Arg Phe Pro His Mec Leu Glu Asn Ala Arg Gly Met Val Ser Gin Leu 

85 90 95 

ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC GCG 336 
Thr Gin Phe Giy Ser Thr Leu Gin Asn lie lie Glu Arg Gin Asp Ala 

100 105 110 

GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA TTG 384 
Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He Leu 
115 120 125 

ACT AAC CTG AGC ATT CAG GAC .\AA ACC ATT GAA GAA TTG GAT GCC GAG 432 
Thr Asn Leu Ser He Gin Asp Lys Thr lie Giu Giu Leu Asp Ala Glu 
130 135 140 

AAA ACG GTG TTG GAA AAA TCC PJkA GCG GGA GCA CAA TCG CGC TTT GAT 430 
Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Aia Gin Ser Arg Phe Asp 
145 150 155 160 

AGC TAC GCC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC CAA 528 

Ser Tyr Gly Lys Leu Tyr Asp Giu Asn lie Asn Aia Giy Glu Asn Gin 

165 170 175 

GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT CAG 576 

Ala Met Thr Leu Arg Ala Ser Ala Aia Giy Leu Thr Thr Aia Val Gin 

180 135 190 

GCA TCC COT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC TTC 624 

Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He Phe 
195 200 205 

GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG ACA 672 
Gly Phe Ala Gly Gly Gly Ser Arg Trp Giy Ala He Ala Giu Ala Thr 
210 215 220 

GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG GAT 720 
Gly Tyr Val Mec Glu Phe Ser Ala Asn Val Mec Asn Thr Glu Aia Asp 
225 230 235 240 

AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TCG GAG 7 68 
Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Giu Trp Giu 

245 250 255 

ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT CAG 816 
He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He Asp Ala Gin 

260 265 270 

CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA ACC 864 
Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Aia Val Leu Gin Lys Thr 
275 280 285 

AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC CTG 912 
Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Aia Phe Leu 
290 295 300 

CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT CGT CGA 5 60 
Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Giy Arg 

310 315 320 
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CT-3 OCG CCG ATT TAC TTC CAG TTC TAG GAT TTG CCC GTC CCG CGT TZC ;003 
Lnu Ala Ala lie T"/r Phe Gin Phe T/r Asp Leu Ala Val Ala Arg C/s 

325 330 235 

5 CTG ATC CCA GAA CAA GOT TAC CCT TCG GAA CTC AAT CAT GAC TCT CCC 1056 
LeU Mec Ala Olu Gin Ala T^t Arg Trp Glu Leu Asn Asp Asp Ser Ala 

340 345 350 

CCC TTC ATT AAA CCC CCC CCC TCG CAG GGA ACC TAT CCC GGT CTC CTT 1104 
10 Arg Phe lie Lys Pro Cly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu 

355 360 365 

CCA GGT GAA ACC TTG ATC CTC ACT CTG CCA CAA ATG GAA GAC GCT CAT 1152 
Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Mec Glu Asp Ala His 
15 370 375 330 

CTG AAA CCC GAT AAA CCC CCA TTA GAG GTT GAA CCC ACA GTA TCG CTC 1200 
Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 
385 390 395 400 






CCC GAA GTT TAT CCA GGA TTA CCA AAA CAT AAC GGT CCA TTT TCC CTG 1248 
Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu 

405 410 415 



25 CCT CAG GAA ATT GAC AAC CTC CTC ACT CAA GGT TCA CCC ACT CCC CCC 1296 
Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala Gly 

420 425 430 

ACT GGT AAT AAT AAT TTG CCG TTC GGC CCC GGC ACG GAC ACT AAA ACC 1344 
30 Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr 

435 440 445 

TCT TTC CAG CCA TCA CTT TCA TTC GCT GAT TTC AAA ATT CCT GAA CAT 1392 
Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu Asp 
35 450 455 460 

TAC CCG CCA TCG CTT CCC AAA ATT CCA CCT ATC AAA CAG ATC ACC CTC 1440 

Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin He Ser Val 
465 470 475 480 



ACT TTC CCC CCG CTA CTC GGA CCG TAT CAG GAT GTA CAG CCA ATA TTC 1488 
Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He Leu 

485 490 495 



45 TCT TAC GGC GAT AAA CCC GGA TTA GCT AAC GGC TCT GAA CCG CTC CCA 1536 

Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala 

500 505 510 

CTT TCT CAC GGT ATC AAT GAC ACC GGC CAA TTC CAG CTC GAT TTC AAC 1584 

50 Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn 

515 520 525 

CAT GGC AAA TTC CTC CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC ACC 1632 

Asp Gly Lys Phe Leu Pro Phe ciu Gly He Ala He Asp Gin Gly Thr 
55 530 535 540 



CTC ACA CTC ACC TTC CCA AAT CCA TCT ATC CCG CAG AAA GGT AAA CAA 1680 
Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gin 
545 550 555 560 

GCC ACT ATC TTA AAA ACC CTC AAC GAT ATC ATT TTC CAT ATT CCC TAC 1728 

Ala Thr Met Leu Lys Thr Leu Asn Asp He He Leu His He Arg Tyr 

565 570 575 



65 ACC ATT AAA TAA 1740 
Thr He Lys ••• 
579 



70 (2) INFORMATION FOR SEQ ID NO: 51: 
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10 



25 



40 



55 



7() 



i i SEQUENCE CHAPJVCTERISTICS : 

(A) LEIIGTH: 5" 9 amino acids 

(B) TYPE: amino acids 

(C) STRA^JDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5L (TcdAiii): 

L©u Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin He Asn 
i 5 10 15 



Glu Vai Mec Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr Asn 
15 20 25 30 

Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro He 
35 40 45 

20 T/r Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala 
50 55 60 



Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp 
65 70 75 80 

Arg Phe Pro His Mec Leu Glu Asn Ala Arg Gly Met Val Ser Gin Leu 

85 90 95 



Thr Gin Phe Gly Ser Thr Leu Gin Asn He He Glu Arg Gin Asp Ala 
30 100 105 110 

Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He Leu 
115 120 125 

35 Thr Asn Leu Ser He Gin Asp Lys Thr He Glu Glu Leu Asp Ala Glu 
130 135 140 



Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe Asp 
145 150 155 160 

Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln 
165 170 175 

Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gln 
180 185 190 

Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile Phe 
195 200 205 

Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr 
210 215 220 

Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp 
225 230 235 240 

Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu 
245 250 255 

Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln 
260 265 270 

Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys Thr 
275 280 285 

Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe Leu 
290 295 300 



Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg 
305 310 315 320 

Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys 
325 330 335 

Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala 
340 345 350 

Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu 
355 360 365 

Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala His 
370 375 380 

Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 
385 390 395 400 

Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu 
405 410 415 

Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly 
420 425 430 






Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr 
435 440 445 

Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp 
450 455 460 

Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val 
465 470 475 480 

Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu 
485 490 495 

Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala 
500 505 510 

Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn 
515 520 525 

Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr 
530 535 540 

Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gln 
545 550 555 560 

Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr 
565 570 575 

Thr Ile Lys ••• 
579 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5532 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52 (TcbAii coding region): 



TTT ATA CAA GGT TAT AGT GAT CTG TTT GGT AAT CCT GCT GAT AAC TAT 48 
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 
1 5 10 15 

GCC GCC CCG GGC TCG GTT GCA TCG ATG TTC TCA CCG GCG GCT TAT TTG 96 
Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu 
20 25 30 

ACG GAA TTG TAC CGT CAA GCC A.^^ AAC TTG CAT GAC ACC AGC TCA ATT 144 

Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile 

35 40 45 

TAT TAC CTA GAT AAA CGT CGC CCG CAT TTA GCA AGC TTA ATG CTC AGC 192 

Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser 

50 55 60 

CAG AAA AAT ATG CAT GAG GAA ATT TCA ACG CTG GCT CTC TCT AAT GAA 240 

Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu 
65 70 75 80 






TTG TGC CTT GCC GGG ATC GAA ACA AAA ACA GGA AAA TCA CAA CAT CAA 288 

Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu 

85 90 95 



GTG ATG GAT ATC TTG TCA ACT TAT CGT TTA ACT CCA GAG ACA CCT TAT 336 

Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr 

100 105 110 

CAT CAC GCT TAT GAA ACT GTT CGT CAA ATC CTT CAT CAA CGT GAT CCA 384 

His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro 

115 120 125 

GGA TTT CGT CAT TTG TCA CAG CCA CCC ATT GTT CCT CCT AAC CTC CAT 432 

Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp 
130 135 140 

CCT GTG ACT TTG TTG CGT ATT AGC TCC CAT ATT TCG CCA GAA CTG TAT 480 

Pro Val Thr Leu Leu Gly He Ser Ser His He Ser Pro Glu Leu Tyr 

145 150 155 160 



AAC TTG CTG ATT GAG GAC ATC CCG CAA AAA GAT GAA GCC CCC CTT CAT 528 
Asn Leu Leu He Glu Glu He Pro Glu Lys Asp Glu Ala Ala Leu Asp 

165 170 175 



40 ACG CTT TAT AAA ACA AAC TTT GCC GAT ATT ACT ACT GCT CAG TTA ATG 576 

Thr Leu Tyr Lys Thr Asn Phe Gly Asp He Thr Thr Ala Gin Leu Met 

180 185 190 

TCC CCA ACT TAT CTG CCC CCG TAT TAT CGC CTC TCA CCG GAA GAT ATT 624 
45 Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp He 

195 200 205 

GCC TAC GTG ACG ACT TCA TTA TCA CAT GTT GGA TAT AGC ACT GAT ATT 672 
Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp He 
50 210 215 220 

CTG GTT ATT CCG TTG CTC GAT GCT GTG GCT AAC ATC GAA CTA GTT CGT 720 
Leu Val He Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg 
225 230 235 240 



GTT ACC CCA ACA CCA TCG GAT AAT TAT ACC ACT CAG ACG AAT TAT ATT 768 
Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gin Thr Asn Tyr He 

245 250 255 



60 GAG CTG TAT CCA CAG GCT CGC GAC AAT TAT TTG ATC AAA TAC AAT CTA 816 
Glu Leu Tyr Pro Gin Gly Gly Asp Asn Tyr Leu He Lys Tyr Asn Leu 

260 265 270 

AGC AAT ACT TTT GCT TTG GAT GAT TTT TAT CTG CAA TAT AAA GAT GGT 864 
65 Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gin Tyr Lys Asp Gly 

275 280 285 

TCC GCT GAT TGC ACT GAC ATT GCC CAT AAT CCC TAT CCT CAT ATG CTC 912 
Ser Aia Asp Trp Thr Glu He Ala His Asn Pro Tyr Pro Asp Met Val 
70 290 295 300 
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ATA AAT C.-J>i rJKG TAT G.-J\ TCA CAG GCC ACA ATC AAA CGT ACT GAC TtT ?dO 

Ile Asn Gln Lys Tyr Glu Ser Gln Ala Thr Ile Lys Arg Ser Asp Ser 
305 310 315 320 

GAC AAT ATA CTC ACT ATA CGG TTA CAA AGA TGG CAT AGC GGT ACT TAT 1008 

Asp Asn Ile Leu Ser Ile Gly Leu Gln Arg Trp His Ser Gly Ser Tyr 

325 330 335 

TTT GCC GCC GCC AAT TTT AAA ATT GAC CAA TAC TCC CCG AAA GCT 1056 

Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gln Tyr Ser Pro Lys Ala 

340 345 350 






TTC CTG CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104 
Phe Leu Leu Lys Mec Asn Lys Ala He Arg Leu Leu Lys Ala Thr Gly 
15 355 360 365 



CTC TCT TTT GCT ACG TTG GAG CGT ATT GTT GAT ACT GTT AAT AGC ACC 1152 
Leu 3er Phe Ala Thr Leu Glu Arg He Vai Asp Ser Val Asn Ser Thr 
370 375 380 

AAA TCC ATC ACG GTT GAG GTA TTA AAC AAG GTT TAT CGG GTA AAA TTC 1200 
Lys Ser He Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 
385 390 395 400 



25 TAT ATT GAT CGT TAT GGC ATC AGT GAA GAG ACA GCC GCT ATT TTG GCT 1248 
Tyr He Asp Arg Tyr Gly He Ser Glu Glu Thr Ala Ala He Leu Ala 

405 4X0 415 

i-AT ATT AAT ATC TCT CAG CAA GCT GTT GGC AAT CAG CTT AGC CAG TTT 1296 
30 Asn He Asn He Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 

420 425 430 

GAG CAA CTA TTT AAT CAC CCG CCG CTC AAT GGT ATT CGC TAT GAA ATC 1344 
Glu Gin Leu Phe Asn His Pro Pro Leu Asn Gly He Arg Tyr Glu He 
35 435 440 445 

AGT GAG GAC AAC TCC AAA CAT CTT CCT AAT CCT GAT CTG AAC CTT AAA 1392 
Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 
450 455 460 



CCA GAC ACT ACC GGT GAT GAT CAA CGC AAG GCG GTT TTA AAA CGC GCG 1440 

Pro Asp Ser Thr Gly Asp Asp Gin Arg Lys Ala Val Leu Lys Arg Ala 
465 470 475 480 



45 TTT CAG GTT AAC GCC AGT GAG TTG TAT CAG ATG TTA TTG ATC ACT GAT 1488 

Phe Gin Val Asn Ala Ser Glu Leu Tyr Gin Met Leu Leu He Thr Asp 

485 490 495 

CGT AAA GAA GAC GGT GTT ATC AAA AAT AAC TTA GAG AAT TTG TCT GAT 1536 

50 Arg Lys Glu Asp Gly Val He Lys Asn Asn Leu Glu Asn Leu Ser Asp 

500 505 5X0 

CTG TAT TTG GTT AGT TTC CTG GCC CAG ATT CAT AAC CTG ACT ATT GCT 1534 

Leu Tyr Leu Val Ser Leu Leu Ala Gin He His Asn Leu Thr He Ala 
55 515 520 525 

GAA TTG AAC ATT TTG TTG GTG ATT TCT GGC TAT GGC CAC ACC AAC ATT 1632 

Glu Leu Asn He Leu Leu Val He Cys Gly Tyr Gly Asp Thr Asn He 
530 535 540 



TAT CAG ATT ACC GAC GAT AAT TTA GCC AAA ATA GTG GAA ACA TTG TTG 1680 
Tyr Gin He Thr Asp Asp Asn Leu Ala Lys He Val Glu Thr Leu Leu 

545 550 555 560 



65 TGG ATC ACT CAA TGG TTG AAG ACC CAA AAA TGG ACA GTT ACC GAC CTG 1723 

Trp He Thr Gin Trp Leu Lys Thr Gin Lys Trp Thr Val Thr Asp Leu 

565 570 575 

TTT CTC ATG ACC ACG GCC ACT TAC AGC ACC ACT TTA ACG CCA CAA ATT 1T76 

70 Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu He 

580 585 590 
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AGC AAT CTG ACG OCT ACG TTG TCT TCA ACT TTG CAT GGC AAA GAG AGT ia:4 
ier Asn Leu Thr Aid Thr Leu 3er Ser Thr Leu His Giy Lys Giu 5er 
^ 595 600 605 

CTG ATT GGG GAA GAT CTG AAA AG A GCA ATG GCG CCT TGC TTC ACT TCG 137? 
Leu lie Gly Glu Asp Leu Lys Arg Aid Mac Ala Pro Cys Phe Thr Ser 
610 615 620 

10 GCT TTG CAT TTG ACT TCT CAA GAA GTT GCG TAT GAC CTG CTG TTG TGG 1920 
Aid Leu His Leu Thr Ser Gin Giu Vai Aid Tyr Asp Leu Leu Leu Trp 
625 630 635 640 

ATA GAC CAC ATT CAA CCG GCA CAA ATA ACT GTT GAT GGG TTT TGG CAA 1963 
15 lie Asp Gin lie Gin Pro Aid Gin lie Thr Vai Asp Gly Phe Trp Glu 

645 650 655 

GAA GTG CAA ACA ACA CCA ACC AGO TTG AAG GTG ATT ACC TTT GCT C^G ?016 
Glu Vai Gin Thr Thr Pro Thr Ser Leu Lys Vai lie Thr Phe Ala Gin " 
0 660 665 670 

GTG CTG GCA CAA TTG AGC CTG ATC TAT CGT CGT ATT GGG TTA AGT GAA 2064 
Vai Leu Ala Gin Leu Ser Leu lie Tyr Arg Arg lie Gly Leu Ser Giu 
^ 675 630 685 

ACG GAA CTG TCA CTG ATC GTG ACT CAA TCT TCT CTG CTA GTG GCA GGC •>112 
Thr Glu Leu Ser Leu lie Vai Thr Gin Ser Ser Leu Leu Vai Ala Gly 
690 695 700 

0 A.^ AGC ATA CTG GAT CAC GGT CTG TTA ACC CTG ATG GCC TTG GAA GGT 2160 
Lys Ser lie Leu Asp His Gly Leu Leu Thr Leu Met Aia Leu Glu Gly 
705 710 715 720 

TTT CAT ACC TGG GTT AAT GGC TTG GGG CAA CAT GCC TCC TTG ATA TTG 2208 
5 Phe His Thr Trp Vai Asn Gly Leu Giy Gin His Aia Ser Leu lie Leu 

725 730 • 735 

GCG GCG TTG AAA GAC GGA GCC TTG ACA GTT ACC CAT CTA GCA CAA GCT 2256 
Ala Ala Leu Lys Asp Giy Aia Leu Thr Vai Thr Asp Vai Aia Gin Aia 

740 745 750 

ATG AAT AAG GAG GAA TCT CTC CTA CAA ATG GCA GCT AAT CAC GTG GAG 2304 
Mec Asn Lys Giu Giu Ser Leu Leu Gin Mec Aia Aia Asn Gin Vai Glu 
755 760 765 

AAG GAT CTA ACA AAA CTG ACC AGT TGC ACA CAC ATT CAC GCT ATT CTG 23 5'> 
Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gin lie Asp Aia lie Leu 
770 775 780 

CAA TGG TTA CAC ATG TCT TCG GCC TTG GCG GTT TCT CCA CTG GAT CTG 2400 
Gin Trp Leu Gin Met Ser Ser Aia Leu Aia Vai Ser Pro Leu Asp Leu 
785 790 795 800 

GCA GGG ATG ATG GCC CTG AAA TAT GGG ATA GAT CAT AAC TAT GCT GCC 2443 
Aid Gly Met Met Ala Leu Lys Tyr Gly lie Asp His Asn Ts'r Ala Ala 

805 8X0 315 

TGG CAA GCT GCG GCG GCT GCG CTG ATG GCT GAT CAT GCT AAT CAG GCA 2496 
Trp Gin Aia Aia Aia Aia Aia Leu Met Aia Asp His Aia Asn Gin Aia 

820 825 830 

CAG AAA AAA CTG GAT GAG ACG TTC AGT AAG GCA TTA TGT AAC TAT T^T 2544 

Gin Lys Lys Leu Asp Glu Thr Phe Ser Lys Aia Leu Cys Asn Tyr Tyr 
835 840 845 

ATT AAT GCT GTT GTC GAT AGT GCT GCT GGA CTA CGT GAT CGT AAC GGT 2592 
lie Asn Aia Vai Vai Asp Ser Aia Aia Giy Vai Arg Asp Arg Asn Gly 
350 855 860 

TTA TAT ACC TAT TTG CTG ATT GAT AAT CAG GTT TCT GCC GAT GTG ATC 2640 
Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile 
865 870 875 880 

ACT TCA CZr ATT CCA G?J< GCT ATC GCC CGT ATT CrA CTG TAC GTT .=uAC 2633 
Thr Ser Arg Ile Ala Glu Ala Ile Ala Gly Ile Gln Leu Tyr Val Asn 

885 890 395 

CGG GCT TTA AAC CGA GAT GAA GGT CAG CTT GCA TCG GAC GTT AGT ACC 27 36 
Arg Ala Leu Asn Arg Asp Glu Giy Gin Leu Ala Ser Asp vai ser Thr 

900 905 910 

CGT CAG TTC TTC ACT GAC TCG GAA CGT TAC AAT AAA CGT TAC AGT ACT 2734 

Arg Gin Phe Phe Thr Asp Trp Glu Arg T/r Asn Lys Arg Tyr Ser Thr 
915 920 925 



15 TGG GCT GGT GTC TCT GAA CTG GTC TAT TAT CCA GAA AAC TAT GTT GAT 2332 
Trp Ala Giy Val Ser Giu Leu Vai Tyr Tyr Pro Glu Asn Tyr Vai Asp 
930 935 940 

CCC ACT CAG CGC ATT GGG CAA ACC AAA ATG ATG GAT GCG CTG TTG CAA 2380 
20 Pro Thr Gin Arg lie Giy Gin Thr Lys Mec Met Asp Ala Leu Leu Gin 
945 950 955 960 

TCC ATC AAC CAG AGC CAG CTA AAT GCG GAT ACG GTG GAA GAT GCT TTC 2923 
Ser lie Asn Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe 
25 965 970 975 

AAA ACT TAT TTG ACC AGC TTT GAG CAG GTA GCA AAT CTG AAA GTA ATT 2976 
Lys Thr Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val lie 

980 985 990 



AGT GCT TAC CAC GAT AAT GTC AAT GTG GAT CAA GGA TTA ACT TAT TTT 3024 
Ser Ala Tyr His Asp Asn Val Asn Val Asp Gin Giy Leu Thr T^t Phe 
995 1000 1005 



35 ATC GGT ATC GAC CAA GCA GCT CCG GGT ACG TAT TAC TGG CGT AGT GTT 3072 

lie Giy lie Asp Gin Ala Ala Pro Giy Thr Tyr Tyr Trp Arg Ser Val 
1010 1015 1020 

GAT CAC AGC AAA TGT GAA AAT GOC AAG TTT CCC GCT AAT GCT TGG GGT 3120 
40 Asp His Ser Lys Cys Glu Asn Giy Lys Phe Ala Ala Asn Ala Trp Giy 
1025 1030 1035 1040 

GAG TGG AAT AAA ATT ACC TGT GCT GTC AAT CCT TGG AAA AAT ATC ATC 3168 
Giu Trp Asn Lys lie Thr Cys Ala Val Asn Pro Trp Lys Asn lie He 
45 1045 1050 1055 

CGT CCG GTT GTT TAT ATG TCC CGC TTA TAT CTG CTA TGG CTG GAG CAG 3216 
Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 

1060 1065 1070 



CAA TCA AAC AAA ACT GAT GAT GGT AAA ACC ACG ATT TAT CAA TAT AAC 3264 

Gin Ser Lys Lys Ser Asp Asp Giy Lys Thr Thr lie Tyr Gin T/r Asn 
1075 1080 1085 



55 TTA AAA CTG GCT CAT ATT CGT TAC GAC GGT AGT TGG AAT ACA CCA TTT 3 312 
Leu Lys Leu Ala His He Arg Tyr Asp Giy Ser Trp Asn Thr Pro Phe 
1090 1095 1100 

ACT TTT GAT GTG ACA GAA AAC GTA AAA AAT TAC ACG TCG AGT ACT GAT 33 60 

60 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 
1105 1110 1115 1120 

GCT GCT GAA TCT TTA GGG TTG TAT TGT ACT GGT TAT CAA GGG GAA GAC 3403 
Ala Ala Glu Ser Leu Giy Leu Tyr Cys Thr Giy Tyr Gin Giy Glu Asp 
65 1125 1130 1135 

ACT CTA TTA GTT ATG TTC TAT TCC ATG CAG AGT AGT TAT AGC TCC TAT 3456 
Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 

1140 1145 1150 



ACC GAT AAT AAT GCG CCG GTC ACT GGG CTA TAT ATT TTC GCT GAT ATG 3504 
Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr Ile Phe Ala Asp Met 
1155 1160 1165 

TCA TCA CAC AAT AT3 ACG AAT GCA CXX GCA ACT .A.=.C TAT TCG A.AT .^-^iC 3 55: 
Ser Ser Asp Asn Met Thr Asn Ala Gln Ala Thr Asn Tyr Trp Asn Asn 
1170 1175 1180 

ACT TAT CCG CAJ^ TTT GAT ACT GTG ATG GCA GAT CCG GAT AGC GAC .V\T 3600 
Ser Tyr Pro Gln Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 
1185 1190 1195 1200 

AAA .A.=iA GTC ATA ACC AG A AG A GTT AAT AAC CCT TAT GCG GAG GAT TAT 3 643 

Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 

1205 1210 1215 

G^Ji ATT CCT TCC TCT GTG AC A AGT AAC ACT AAT TAT TCT TGG GGT GAT 3696 
Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp 

1220 1225 1230 

CAC AGT TTA ACC ATG CTT TAT GGT GGT AGT GTT CCT AAT ATT ACT TTT 3744 
His Ser Leu Thr Mec Leu Tyr Gly Gly Ser Val Pro Asn lie Thr Phe 
1235 1240 1245 

GAA TCG GCG GCA GAA GAT TTA AGG CTA TCT ACC AAT ATG GCA TTG AGT 3792 
Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 
1250 1255 1260 

ATT ATT CAT AAT GGA TAT GCG GGA ACC CGC CGT ATA CAA TGT AAT CTT 3840 
lie lie His Asn Giy Tyr Ala Giy Thr Arg Arg lie Gin Cys Asn Leu 
1265 1270 1275 1280 

ATG AAA CAA TAC GCT TCA TTA GGT GAT AAA TTT ATA ATT TAT GAT TCA 3338 
Met Lys Gin Tyr Ala Ser Leu Giy Asp Lys Phe lie lie Tyr Asp Ser 

1285 1290 1295 

TCA TTT GAT GAT GCA AAC CGT TTT AAT CTG GTG CCA TTG TTT AAA TTC 3936 
Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 

1300 1305 1310 

GGA AAA GAC GAG AAC TCA GAT GAT AGT ATT TGT ATA TAT AAT GAA AAC 3984 
Gly Lys Asp Glu Asn Ser Asp Asp Ser lie Cys lie Tyr Asn Glu Asn 
1315 1320 1325 

CCT TCC TCT GAA GAT AAG AAG TGG TAT TTT TCT TCG AAA GAT GAC AAT 4032 
Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe ser Ser Lys Asp Asp Asn 
1330 1335 1340 

AAA AC A GCG GAT TAT AAT GGT GGA ACT CAA TGT ATA GAT GCT GGA ACC 4080 
Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys lie Asp Ala Giy Thr 
A345 1350 1355 1360 

AGT AAC AAA GAT TTT TAT TAT AAT CTC CAG GAG ATT GAA GTA ATT AGT 4128 
Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu lie Glu Val lie Ser 

1365 1370 1375 

GTT ACT GGT GGG TAT TGG TCG AGT TAT AAA ATA TCC AAC CCG ATT AAT 4176 
Val Thr Cly Gly Tyr Trp Ser Ser Tyr Lys lie Ser Asn Pro lie Asn 

1380 1385 1390 

ATC AAT ACG GGC ATT GAT AGT GCT AAA GTA AAA GTC ACC GTA AAA GCG 4224 
lie Asn Thr Gly lie Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 
1395 1400 1405 

GGT GGT GAC GAT CAA ATC TTT ACT GCT GAT AAT AGT ACC TAT GTT CCT 4272 
Gly Cly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Vai Pro 
1410 1415 1420 

CAG CAA CCG GCA CCC AGT TTT CAC CAC ATG ATT TAT CAG TTC AAT 4320 
Gin Gin Pro Ala Pro Ser Phe Glu Glu Met He Tyr Gin Phe Asn Asn 
^•125 1430 1435 1440 

«« *• £ 
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•-TG AC A ATA GAT TGT .=^^0 AAT TTA A.^T TTC ATC GAC .^-AT CAG GCA CAT 4 -co 

Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gln Ala His 

1445 1450 1455 

5 ATT GAG ATT GAT TTC ACC GCT ACG CCA GAT GGC CGA TTC TTG GGT 44 io 

lie Glu He Asp Phe Thr Ala Thr Ala Gin Asp Giy Arg Phe Leu Giy 

1460 1465 1470 

GCA GAA ACT TTT ATT ATC CCG GTA ACT AAA AAA GTT CTC GGT ACT GAG 4464 
Ala Glu Thr Phe Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Glu 

1475 1480 1485 

AAC GTG ATT GCG TTA TAT AGC CAA .AAT AAC GGT GTT CAA TAT ATG CAA 4512 
Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gln Tyr Met Gln 
1490 1495 1500 

ATT GGC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTG 4560 
lie Giy Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 
1505 1510 1515 1520 






GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA GTG CTC ACT ATG GAA ACT 4603 

Vai Ser Arg Ala Asn Arg Giy He Asp Ala Val Leu Ser Met Glu Thr 

1525 1530 1535 



"^5 CAG AAT ATT CAG GAA CCG CAA TTA GGA GCG GGC AC A TAT GTG CAG CTT 4656 

Gin Asn He Gin Glu Pro Gin Leu Giy Ala Giy Thr Tyr Vai Gin Leu 

1540 1545 1550 

GTG TTG GAT AAA TAT GAT GAG TCT ATT CAT GGC ACT AAT AAA AGC TTT 4704 
30 Val Leu Asp Lys Tyr Asp Glu Ser He His Giy Thr Asn Lys Ser Phe 

1555 1560 1565 

GCT ATT GAA TAT GTT GAT ATA TTT AAA GAG AAC GAT ACT TTT GTG ATT 4752 
Ala He Glu Tyr Val Asp lie Phe Lys Glu Asn Asp Ser Phe Val He 
35 1570 1575 1580 

TAT CAA GGA GAA CTT AGC GAA ACA AGT CAA ACT GTT GTG AAA GTT TTC 4300 
Tyr Gin Giy Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 
1585 1590 1595 1600 

TTA TCC TAT TTT ATA GAG GCG ACT GGA AAT AAG AAC CAC TTA TGG GTA 4843 

Leu Ser Tyr Phe He Glu Ala Thr Giy Asn Lys Asn His Leu Trp Val 

1605 1610 1615 

45 CGT GCT AAA TAC CAA AAG GAA ACG ACT GAT AAG ATC TTG TTC GAC CGT 4396 
Arg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys He Leu Phe Asp Arg 

1620 1625 1630 

ACT GAT GAG AAA GAT CCG CAC GGT TGG TTT CTC AGC GAC GAT CAC AAG 4944 

50 Thr Asp Glu Lys Asp Pro His Giy Trp Phe Leu Ser Asp Asp His Lys 

1635 1640 1645 

ACC TTT AGT GGT CTC TCT TCC GCA CAG GCA TTA AAG AAC GAC AGT GAA 4992 
Thr Phe Ser Giy Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Giu 
55 1650 1655 1660 

CCG ATG GAT TTC TCT GGC GCC AAT GCT CTC TAT TTC TGG GAA CTG TTC 5040 
Pro Met Asp Phe Ser Giy Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 
1665 1670 1675 1680 

^ TAT TAC ACG CCG ATG ATG ATG GCT CAT CGT TTG TTC CAG GAA CAG AAT 5088 
Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Giu Gin Asn 

1685 1690 1695 

65 TTT GAT CCG GCG AAC CAT TGG TTC CGT TAT CTC TGG AGT CCA TCC GGT 5136 
Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Giy 

1700 1705 1710 

TAT ^^TC GTT GAT GGT AAA ATT GCT ATC TAC CAC TGG AAC GTG CGA CCG 5184 
70 T/r He Val Asp Giy Lys He Ala He Tyr His Trp Asn Val Arg Pro 

1715 1720 1725 
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CTG GAA GAA GAC ACC AGT TCG AAT GCA CAA CAA CTC GAC TCC A^C G.nT i..- 
Leu Glu Glu Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp 
1730 1735 1740 

CC\ GAT OCT GTA GCC CAA GAT GAT CCG ATG CAC TAG AAG GTG GCT ACC 5230 
Pro Asp Ala Val Ala Gln Asp Asp Pro Met His Tyr Lys Val Ala Thr 
1745 1750 1755 1760 

TTT ATG CCG ACG TTG GAT CTG CTA ATG GCC CGT GGT GAT GCT GCT TAC 53 23 
Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 

1765 1770 1775 

CGC CAG TTA GAG CGT GAT ACG TTG GCT CAA GCT AAA ATG TGG TAT AC A 537 6 
Arg Gln Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 
1780 1785 1790 

CAG GCG CTT AAT CTG TTG GGT GAT GAG CCA CAA GTG ATG CTG AGT ACG 5424 
Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 
1795 1800 1805 

\Cr TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAC 5472 
Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gln 
1810 1815 1820 

CiG GTT CGT CAG CAA GTG CTT ACC CAG TTG CGT CTC AAT AGC AGG GTA 5520 
Gln Val Arg Gln Gln Val Leu Thr Gln Leu Arg Leu Asn Ser Arg Val 
1825 1830 1835 1840 

AAA ACC CCG TTG 5532 
Lys Thr Pro Leu 

1844 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1844 amino acids 

(B) TYPE: amino acids 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53 (TcbAii): 

Features    From    To      Description 
Peptide     1       1844    TcbAii peptide 
Fragment    1       11      (SEQ ID NO:1) 
Fragment    978     990     (SEQ ID NO:23) 
Fragment    1387    1401    (SEQ ID NO:22) 
Fragment    1484    1505    (SEQ ID NO:24) 
Fragment    1527    1552    (SEQ ID NO:21) 

Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 
1 5 10 15 

Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu 
20 25 30 

Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile 
35 40 45 

Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser 
50 55 60 

Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu 
65 70 75 80 

Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu 
85 90 95 

Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr 
100 105 110 

His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro 
115 120 125 

Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp 
130 135 140 

Pro Vai Thr Leu Leu Gly He Ser Ser His He Ser Pro Glu Leu Tyr 
145 150 155 160 

15 Asn Leu Leu He Glu Glu He Pro Glu Lys Asp Glu Ala Ala Leu Asp 

165 170 175 






Thr Leu Tyr Lys Thr Asn Phe Gly Asp He Thr Thr Ala Gin Leu Met 

180 185 190 

Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Vai Ser Pro Glu Asp He 
195 200 205 



Ala Tyr Vai Thr Thr Ser Leu Ser His Vai Gly Tyr Ser Ser Asp He 
25 210 215 220 

Leu Vai He Pro Leu Vai Asp Gly Vai Gly Lys Met Glu Vai Vai Arg 

225 230 235 240 

30 Vai Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gin Thr Asn Tyr He 

245 250 255 



Glu Leu Tyr Pro Gin Gly Gly Asp Asn Tyr Leu He Lys Tyr Asn Leu 

260 265 270 

Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr *Leu Gin Tyr Lys Asp Gly 
275 280 285 



Ser Ala Asp Trp Thr Giu He Ala His Asn Pro Tyr Pro Asp Met Vai 
40 290 295 300 

He Asn Gin Lys Tyr Glu Ser Gin Ala Thr He Lys Arg Ser Asp Ser 
305 310 315 320 

45 Asp Asn He Leu Ser He Gly Leu Gin Arg Trp His Ser Gly Ser Tyr 

325 330 335 



Asn Phe Ala Ala Ala Asn Phe Lys He Asp Gin Tyr Ser Pro Lys Ala 

340 345 350 

Phe Leu Leu Lys Met Asn Lys Ala He Arg Leu Leu Lys Ala Thr Gly 
355 360 365 



Leu Ser Phe Ala Thr Leu Glu Arg He Vai Asp Ser Vai Asn Ser Thr 

55 370 375 330 

Lys Ser He Thr Vai Glu Vai Leu Asn Lys Vai Tyr Arg Val Lys Phe 

385 390 395 400 

60 Tyr He Asp Arg Tyr Gly He Ser Glu Glu Thr Ala Ala He Leu Ala 

405 410 415 



Asn He Asn He Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 

420 425 430 

Glu Gin Leu Phe Asn His Pro Pro Leu Asn Gly He Arg Tyr Glu He 
435 440 445 



Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 
450 455 460 

Pro Asp Ser Thr Gly Asp Asp Gln Arg Lys Ala Val Leu Lys Arg Ala 
465 470 475 480 

Phe Gln Val Asn Ala Ser Glu Leu Tyr Gln Met Leu Leu Ile Thr Asp 
485 490 495 

Arg Lys Glu Asp Gly Val Ile Lys Asn Asn Leu Glu Asn Leu Ser Asp 
500 505 510 

Leu Tyr Leu Val Ser Leu Leu Ala Gln Ile His Asn Leu Thr Ile Ala 
515 520 525 

Glu Leu Asn Ile Leu Leu Val Ile Cys Gly Tyr Gly Asp Thr Asn Ile 
530 535 540 

Tyr Gln Ile Thr Asp Asp Asn Leu Ala Lys Ile Val Glu Thr Leu Leu 
545 550 555 560 

Trp Ile Thr Gln Trp Leu Lys Thr Gln Lys Trp Thr Val Thr Asp Leu 
565 570 575 

Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu Ile 
580 585 590 

Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser 
595 600 605 

Leu Ile Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser 
610 615 620 

Ala Leu His Leu Thr Ser Gln Glu Val Ala Tyr Asp Leu Leu Leu Trp 
625 630 635 640 

Ile Asp Gln Ile Gln Pro Ala Gln Ile Thr Val Asp Gly Phe Trp Glu 
645 650 655 

Glu Val Gln Thr Thr Pro Thr Ser Leu Lys Val Ile Thr Phe Ala Gln 
660 665 670 

Val Leu Ala Gln Leu Ser Leu Ile Tyr Arg Arg Ile Gly Leu Ser Glu 
675 680 685 

Thr Glu Leu Ser Leu Ile Val Thr Gln Ser Ser Leu Leu Val Ala Gly 
690 695 700 

Lys Ser Ile Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly 
705 710 715 720 

Phe His Thr Trp Val Asn Gly Leu Gly Gln His Ala Ser Leu Ile Leu 
725 730 735 

Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gln Ala 
740 745 750 

Met Asn Lys Glu Glu Ser Leu Leu Gln Met Ala Ala Asn Gln Val Glu 
755 760 765 

Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gln Ile Asp Ala Ile Leu 
770 775 780 

Gln Trp Leu Gln Met Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu 
785 790 795 800 

Ala Gly Met Met Ala Leu Lys Tyr Gly Ile Asp His Asn Tyr Ala Ala 
805 810 815 

Trp Gln Ala Ala Ala Ala Ala Leu Met Ala Asp His Ala Asn Gln Ala 
820 825 830 

Gln Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr 
835 840 845 



Ile Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly 
850 855 860 

Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile 
865 870 875 880 

Thr Ser Arg Ile Ala Glu Ala Ile Ala Gly Ile Gln Leu Tyr Val Asn 

885 890 895 

Arg Aia Leu Asn Arg Asp Glu Gly Gin Leu Ala Ser Asp Vai Ser Thr 

900 905 910 



Arg Gin Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr 
15 91S 920 925 

Trp Ala Gly Vai Ser Glu Leu Vai Tyr Tyr Pro Glu Asn Tyr Vai Asp 
930 935 940 

20 Pro Thr Gin Arg lie Gly Gin Thr Lys Mec Mec Asp Ala Leu Leu Gin 
945 950 955 960 



Ser lie Asn Gin Ser Gin Leu Asn Ala Asp Thr Vai Glu Asp Ala Phe 

965 970 975 

Lys Thr Tyr Leu Thr Ser Phe Glu Gin Vai Ala Asn Leu Lys Vai lie 

980 985 990 



Ser Ala Tyr His Asp Asn Vai Asn Vai Asp Gin Gly Leu Thr Tyr Phe 
30 995 1000 1005 

lie Gly He Asp Gin Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val 

1010 1015 1020 

35 Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 
1025 1030 1035 1040 



Glu Trp Asn Lys He Thr cys Ala Val Asn Pro Trp Lys Asn He He 

1045 1050 1055 

Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 

1060 1065 1070 



Gin Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr He Tyr Gin Tyr Asn 
45 1075 1080 1085 

Leu Lys Leu Ala His He Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 
1090 1095 1100 

50 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 
1105 mo 1115 1120 



Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Glu Asp 

1125 1130 1135 

Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 

1140 1145 1150 



Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr He Phe Ala Asp Mec 
6(1 1155 1160 1165 

Ser Ser Asp Asn Met Thr Asn Ala Gin Ala Thr Asn Tyr Trp Asn A.sn 
1170 1175 1180 

65 Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 
1135 1190 1195 1200 



Lys Lys Val He Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 

i205 1210 1215 

Glu He Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp 
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1220 1225 1230 

His Ser Leu Thr Met Leu r/r Gly Gly Ser Vai Pro Asn lie Thr Phe 
1235 1240 1245 

5 

Clu Ser Ala Ala Clu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 
1250 1255 1260 

lie lie His Asn Gly Tyr Ala Gly Thr Arg Arg He Gin Cys Asn Leu 
10 1265 1270 1275 1230 

Met Lys Gin Tyr Ala Ser Leu Gly Asp Lys Phe He He Tyr Asp Ser 

1285 1290 1295 

15 ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 

1300 1305 1310 






Gly Lys Asp Glu Asn Ser Asp Asp Ser He c>'s He Tyr Asn clu Asn 
1315 1320 1325 

Pro ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 
1330 1335 1340 



Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys He Asp Ala Gly Thr 
25 1345 1350 1355 1360 

Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu He Glu Val He Ser 

1365 1370 1375 

30 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys He Ser Asn Pro He Asn 

1380 1385 1390 



He Asn Thr Gly He Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 
1395 1400 1405 

Gly Gly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 
1410 1415 1420 



Gin Gin Pro Ala Pro Ser Phe Glu Glu Met He Tyr Gin Phe Asn Asn 
40 1425 1430 1435 1440 

Leu Thr He Asp Cys Lys Asn Leu Asn Phe He Asp Asn Gin Ala His 

1445 1450 1455 

45 He Glu He Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Gly 

1460 1465 1470 



Ala Glu Thr Phe He He Pro Vai Thr Lys Lys Val Leu Gly Thr Glu 
1475 1480 1485 

Asn Val He Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 

1490 1495 1500 



He Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 
55 1505 1510 1515 1520 

Val Ser Arg Ala Asn Arg Gly He Asp Ala Val Leu Ser Met Glu Thr 

1525 1530 1535 

W) Gin Asn He Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 

1540 1545 1550 



Val Leu Asp Lys Tyr Asp Glu Ser He His Gly Thr Asn Lys Ser Phe 
1555 1560 1565 

Ala He Glu Tyr Val Asp He Phe Lys Glu Asn Asp Ser Phe Val He 
1570 1575 1580 



Tyr Gln Gly Glu Leu Ser Glu Thr Ser Gln Thr Val Val Lys Val Phe 
1585 1590 1595 1600 

Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 

1605 1610 1615 

Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg 

5 1620 1625 1630 

Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 

1635 1640 1645 

10 Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 

1650 1655 1660 






Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 
1665 1670 1675 1630 

Tyr Ti-r Thr Pro Mec Met Mec Ala His Arg Leu Leu Gin Glu Gin Asn 

1685 1690 1695 



Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Vai Trp Ser Pro Ser Gly 
20 1700 1705 1710 

Tyr He Val Asp Gly Lys He Aia He Tyr His Trp Asn Val Arg Pro 
1715 1720 1725 

25 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 

1730 1735 1740 



Pro Asp Ala Val Ala Gin Asp Asp Pro Mec His T/r Lys Val Ala Thr 
1745 1750 1755 1760 

Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 

1765 1770 1775 



Arg Gin Leu Glu Arg Asp Thr Leu Aia Glu Ala Lys Met Trp Tyr Thr 
35 1780 1785 1790 

Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 
1795 1800 1805 

40 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 
1810 1815 1820 



Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 
1825 1830 1835 1840 

Lys Thr Pro Leu 

1844 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1722 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54 (TcbAiii coding region): 



CTA CGA ACA GCC AAT TCC CTG ACC GCT TTA TTC CTG CCG GAG GAA AAT 48 

Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn 
1 5 10 15 

AGC AAG CTC AAA GGC TAG TGG CCG ACA CTG GCG CAG CGT ATG TTT AAT 96 

Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn 

20 25 30 
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TTA CGT CAT AAT CTG TCG ATT GAC GCC CAG CC3 CTC TCC TTG CCG CTG 144 

Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu 
35 40 45 

TAT GCT AAA CCG GCT GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA IjZ 
Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 
50 55 60 

GCT TCT CAA GGG GGA GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC 240 
Ala Ser Gin Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr lie His 
65 70 75 30 

CGC TTC CCT CAA ATG CTA GAA GGG GCA CGC GGC TTG GTT AAC CAG CTT 233 
Arg Phe Pro Gin Met Leu Glu Gly Aia Arg Gly Leu Val Asn Gin Leu 

85 90 95 

ATA CAG TTC GGT AGT TCA CTA TTG GGG TAG AGT GAC CCT CAC CAT GCG 356 
He Gin Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala 

100 105 110 

GAA GCT ATG AGT CAA CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG 334 

Glu Ala Mec Ser Gin Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu 
115 120 125 

ACC AGT ATT CGT ATG CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA 432 

Thr Ser He Arg Met Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu 
130 135 140 

AAA ACC GCC TTG CAA GTC TCT TTA GCT GGA GTG CAA CAA CCG TTT GAC 480 
Lys Thr Ala Leu Gin Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp 
145 150 155 160 

AGC TAT AGC CAA CTG TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA 528 

Ser Tyr Ser Gin Leu Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg 

165 170 175 


GCG CTG GCC TTA CGC TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG 57 6 
Ala Leu Ala Leu Arg Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin 

180 185 190 

ATT TCC CGT ATG GCA GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC 624 

He ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn He Phe 
195 200 205 

GCC CTG GCT GAT GCC GCC ATG CAT TAT GCT GCT ATT GCC TAT GCC ATC 672 
Gly Leu Ala Asp Gly Cly Met His Tyr Gly Ala He Ala Tyr Ala He 
210 215 220 

GCT GAC GGT ATT GAG TTG ACT GCT TCT GCC AAG ATG GTT GAT GCG GAG 720 
Ala Asp Gly He Glu Leu Ser Ala Ser Aia Lys Met Val Asp Ala Glu 
225 230 235 240 

AAA GTT GCT CAG TCG GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA 7 68 

Lys Val Ala Gin Ser Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys 

245 250 255 

ATT CAG CGT GAC AAC GCA CAA GCG GAG ATT .^C CAG TTA AAC GCG CAA 316 
He Gin Arg Asp Asn Aia Gin Aia Glu He Asn Gin Leu Asn Aia Gin 

260 265 270 

CTG GAA TCA CTG TCT ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG 864 
Leu Glu Ser Leu Ser He Arg Arg Glu Aia Aia Glu Met Gin Lys Glu 
275 280 285 

TAC CTG AAA ACC CAG CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA 912 

Tyr Leu Lys Thr Gin Gin Ala Gin Aia Gin Aia Gin Leu Thr Phe Leu 
290 295 300 

ACA AGC AAA TTC AGT AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT 960 

Arg Ser Lys Phe Ser Asn Gin Aia Leu Tyr ser Trp Leu Arg Gly Arg 
305 310 315 320 
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IS 






TTC TTA 3CT ATT TAT TTC CAG TTC TAT GAC TTG GCC CTA TCA CGT TZC 1GJ3 
Leu Ser Gly He r/r Phe Gin Phe Tyr Asp Leu Ala Val ser Arg c^s 

325 330 335 

TTG ATG CCA GAG CAA TCC TAT CAA TGG GAA OCT AAT GAT AAT TCC \TT lo-^^ 
Leu Met Ala Glu Gin Ser lyr Gin Trp Glu Ala Asn Asp Asn 3« 

340 345 350 

ACC TTT GTC AAA CCC GOT GCA TGG CAA CGA ACT TAG GCC GGC TTA TTG ilOJ 
?er Phe Val Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala cly Hi IIS ' 
JSa 360 365 

TGT GGA GAA GCT TTG ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT 1152 
cys Gly Glu Ala Leu He Gin Asn Leu Ala Gin Mec Glu Glu Ala T^r 

fIJ: fr^ V:?. IS! E^a gta gaa cgc acg gtt TCA rrc 1200 



Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 
*u Jd^ 390 395 



400 

GCA GTG GTT TAT GAT TCA CTG GAA GGT AAT CAT CGT TTT AAT TTA GCC I5dfl 
Ala Val val Tyr Asp Ser Leu Glu Gly Asn Asp Arg PlU Ala 

25 415 

rlt 7^ ^^"^ ^ GCA GGA ACT 1296 

Glu Gin He Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr 

*20 425 430 

AAA GAA AAT GGG TTA TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GT^ 1344 
Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val 
435 440 445 

AAA TTC TCC GAC TTC AAA CTG GGA ACC GAT TAT CCA GAC AGT ATC GTT 1392 
L/s Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp ser He Val 
450 455 



^3 sf^ ^ ^ GTT TCG CTA CCT 1440 

Gly ser Asn Lys Val Arg Arg He Lys Gin He Ser Val Ser Leu Pro 
W 465 470 475 

7^. ^ ^^'^ I" ^^'^ ^'^ ATG CTC ACC TAT GGT 1488 

Ala Leu Val Gly Pro Tyr Gin Asp Val Gin Ala Mec Leu Ser Tyr Gly 

485 490 495 

G?5 Q^I inl TTG CCG AAA GGT TGT TCA GCC TTG GCT GTG TCT CAT 1536 

Gly Ser Thr Gin Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His 

500 505 510 

GGT ACC AAT GAT AGT GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA 1584 
Gly Thr Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys 
515 520 525 

TAG CTG CCA TTT GAA GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT 1632 
Tyr Leu Pro Phe Glu Gly He Ala Leu Asp Asp Gin Gly Thr Leu Asn 

535 540 

CTT CAA TTT CCG AAT GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT 1680 

AO c!c '^^^ Gin Lys Ala He Leu Gin Thr 

W 545 550 555 560 

ATG AGO GAT ATT ATT TTG CAT ATT CGT TAT ACC ATC CGT TAA 1722 
Met Ser Asp He He Leu His He Arg Tyr Thr He Arg ••• 

65 ^'^^ ^'^^ 



(2) INFORMATION FOR SEQ ID NO: 55: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 amino acids 

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55 (TcbAiii): 

Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn
1 5 10 15

Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn
20 25 30

Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu
35 40 45

Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser
50 55 60

Ala Ser Gln Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His
65 70 75 80

Arg Phe Pro Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu
85 90 95

Ile Gln Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala
100 105 110

Glu Ala Met Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu
115 120 125

Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu
130 135 140

Lys Thr Ala Leu Gln Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp
145 150 155 160

Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg
165 170 175

Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln
180 185 190

Ile Ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe
195 200 205

Gly Leu Ala Asp Gly Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile
210 215 220

Ala Asp Gly Ile Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu
225 230 235 240

Lys Val Ala Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys
245 250 255

Ile Gln Arg Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln
260 265 270

Leu Glu Ser Leu Ser Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu
275 280 285

Tyr Leu Lys Thr Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu
290 295 300

Arg Ser Lys Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg
305 310 315 320

Leu Ser Gly Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys
325 330 335

Leu Met Ala Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile
340 345 350

Ser Phe Val Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

Cys Gly Glu Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr
370 375 380

Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala
405 410 415

Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr
420 425 430

Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val
435 440 445

Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val
450 455 460

Gly Ser Asn Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro
465 470 475 480

Ala Leu Val Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly
485 490 495

Gly Ser Thr Gln Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His
500 505 510

Gly Thr Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys
515 520 525

Tyr Leu Pro Phe Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn
530 535 540

Leu Gln Phe Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr
545 550 555 560

Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg
565 570 573

(2) INFORMATION FOR SEQ ID NO: 56 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2898 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 (tccA)

60 1 ATG AAT CAA CTC CCC ACT CCC CTC ATT TCC CGC ACC GAA GAG ATC CAC 43 
1 Mec Asn Gin Leu Ala Ser Pro Leu He Ser Arg Thr Glu Glu He His 16 

49 A.^C TTA CCC GGT AAA TTG ACC GAT CTT GGT TAT ACC TCA GTG TTT GAT bo 
65 17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 12 

97 GTG GTA CGT ATG CCG CGT GAG CGT TTT ATT CGT GAG CAT CGT GCT GAT 144 
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3 Val Val Arg Met Pro Arg Glu Arg Phe lie Arg Glu His Arg Ala Asp 4i 



145 CTC CGC CCC ACT GCT GAA AAA ATG TAT GAC CTG GCA GTG GGC TAT OCT iS2 
5 11 Leu Cly Arg Ser Aia Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala o4 



193 CXT CAG GTC TTA CAC CAT TTT CGC CCT K\T TCT CTT ACT GAA GCT GTT 240 

65 His Gil val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 30 


241 CAG TTT GGC TTC AGA ACT CCC TTC TCC GTA TCA GGC CCG GAT TAC CCC 233 

31 Gin Phe Cly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 9o 

2fl9 AAT CAC TTT CTT CAT GCA AAC ACG GOT TOG AAA GAT AAA GCA CCA ACT 3 36 

97 Asn Gin Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro s*r lU 

20 337 GGA TCA CCG GAA CCC AAT GAT CCC CCG GTA GCC TAT CTG ACT CAT ATT 334 

113 Gly ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His He 123 

335 TAT CAA TTG GCC CTT GAA CAG GAA AAC AAT GGC GCC ACT ACC ATT ATG 432 

25 129 Tyr Gin Leu Ala Leu Glu Gin Glu Lys Asn Gly Ala Thr Thr He Met 144 






433 AAT ACC CTC CCG GAG CCT CGC CCC GAT CTG GCT GCT TTG TTA ATT AAT 430 

i45 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu He Asn 160 

481 GAT AAA GCA ATC AAT GAG GTG ATA CCG CAA TTG CAG TTG GTC AAT GAA 528 

161 Asp Lys Ala He Asn Glu Val He Pro Gin Leu Gin Leu Val Asn Glu 17o 

529 ATT CTC TCC AAA GCT ATT CAG AAO AAA CTG ACT TTG ACT GAT CTG GAA 576 

177 He Leu ser Lys Ala He Gin Lys Lys Leu Ser Leu Thr Asp Leu Glu 192 



40 577 GCG GTA AAC GCC AGA CTT TCC ACT ACC CCT TAC CCC AAT AAT CTG CCG 624 

1S3 Ala val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208 

625 TAT CAT TAT GCT CAT CAG CAG ATT CAG ACA GCT CAA TCG GTA TTC GGT 672 

45 209 Tyr His Tyr Gly His Gin cln He Gin Thr Ala Gin Ser Val L«u Gl/ 



673 ACT ACC TTC CAA CAT ATC ACT TTC CCA CAG ACG CTG GAT CTG CCG CAA 720 

225 Thr Thr Leu Gin Asp He Thr Leu Pro Gin Thr Leu Asp Leu Pro Gin 240 

721 AAC TTC TCG GCA ACA GCA AAA CCA AAA CTC ACC CAT ACG ACT GCC ACT 7 53 

241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala :.er -5o 

7 69 GCT TTC ACC CGA CTC CAA ATC ATC GCG AGT CAG TTT TCG CCA GAG CAG 316 

257 Ala Leu Thr Arg Leu Gin He Met Ala Ser Gin Phe Ser Pro Glu Gin 



60 817 CAC AAA ATC ATT ACC GAG ACT GTC GGT CAG GAT TTC TAT CAG CTT AAC 364 

273 Gin Lys He He Thr Glu Thr Val Gly Gin Asp Phe Tyr Gin Leu Asn -od 

365 TAT CCT CAC ACT TCC CTT ACT GTC AAT AGT TTC ACC GAC ATG ACC ATA 312 

65 289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr He 
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^r^ mI^ ^^"^ "'''^ ^"^^ =TA GA>. CTC ATJ TT: --io 

30^ Met Thr Asp Arg Thr Ssr Leu Thr Val Pro Gin Vai ciu Leu Msc Leu ' ' 



• mm 



9 = 1 TGT TCA ACT GTC CCA OCT TCT ACG CTT GTT AAC TCT GAT AAT GTG I'T i.-s 
Cys ST Thr Vai cly Gly s.r Thr Val Val Ly. Sr Sp ^ll V^J 

iu09 TCT GOT GAC ACQ ACA GCG ACC CCA TTT CCC TAT CGC CCC CGC TTT ^TT iC5^ 

o« Gly ASP Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe ill Isl 

^111 Sf^ 5?"^ ^ °^ ''^^ ACT CGC ACT GGT GCG GAG 11C4 

3:.3 His Ala Gly Lys Pro Glu Ala He Thr Leu Ser Ara Ser Gly Ala gTu led 

5^"^ IT '^^^ ^'^ -"^^ '^'^ ACA GAT GAC AAG TTC Ci" 

3o9 Ala Hxs Phe Ala Leu Thr Val Asn Asn Lau Thr Asp Asp Lys Leu Asp 3^4 

1153 CGT ATT AAC CCC ACA GTG CGC CTO CAA AAA TOS CTC AAT CTC CCT TAT 

.35 Arg He Asn Arg Thr Val .^g Leu Gin Lys Trp Leu Asn Leu Pro lyr ioS 

'III ctu A=I rTI »f f,^ °'='^ ^'^T GCG G.^A ACA CGA 1243 

401 Glu Asp He Asp Leu Leu Val Thr Ser Ala Mec Asp Ala Glu Thr Gly iu 

^lll t^*^ ACG CTC CGT ATC TK GGA GTC 12 

417 Asn Thr Ala Leu Ser Mec Asn Asp Asn Thr Leu Arg Mec Leu Gly Val 43 



96 
2 



^aV^ 13^ ^ ^^"^ '^^'^ AAG TAT CGT GTT AGC GCT AAA CAA TIT GCT '344 

433 Phe Lys His T/r Gin Ala Lys Tyr Gly Val Ser Ala Lys Gin Phe All 443 

^Itl ^"^ °'TA CCG TTT GCC ATT ACA CCG CCA ACG CCG 139- 

449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala He Thr Pro Ala Thr Pro Je!' 



13 93 TTT TTA GAC CAA CTO TTT AAC TCC CTC CGC ACC TTT GAT ACA CCG TIT 14 
4o5 Phe Leu Asp Gin Val Phe Asn Ser val Gly Thr Phe Asp Thr Pro Phe 



40 
430 



^it\ SI? tT^ ^^"^ ^^"^ ^"^^ "^AT ACA TTC ACC ACC GGG GGC GAT 1433 

481 Val He Asp Asn Gin Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 496 



'111 ^ f^'' AGC ACG CCA CTC GCC CTC AAT CAT CGT 15 3 6 

497 v^iy Ala Arg Val Lys His He Ser Thr Ala Leu Gly Leu Asn His Arg 512 

^VA 11^ °AT AAT ATT GCC CCT CAA CAC GGG K\T GTC 153 4 

513 oln Phe Leu Leu Leu Ala Asp Asn He Ala Arg Gin Gin Gly Asn Val 52a 

Th^ ^ '^'^ AAC TCT AAT CTC TTT GTC GTC TCA GCT TTC TAC 163^ 

Tnr Cln Ser Thr Leu Asn Cys Asn Leu Phe val Val ser Ala Phe Ti/r 544' 

^iil ^^I ACA TTC GGG ATA AAT CCA GAG TCT TTC li30 

^>45 „rg Leu Ala Asn Leu Ala Arg Thr Leu Gly He Asn Pro Glu Ser Ph* 560 



1631 TCT GCC TTC CTT GAT CCA TTA GAT CCA GCT ACA CCC ATC CTC TGG CAG 
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551 Cys Aid Leu VaI Asp Arg Leu Asp Ala Gly Thr Giy lie VaI Trp -^In 



1729 C?Ji TTG GCA GGG AAA CCC AC A ATC ACG OTA CCA CAA AAA GAT TCC CCZ r^i 
577 Gin Leu Ala Gly Lys Pro Thr He Thr Val Pro Gin Lys Asp Ser Pro 5--: 



1777 CTG GCG GCG GAT ATT CTG AGT TTG CTG CAA GCG CTA AGT GCG ATT GCT 
1324 

59 3 Leu Ala Ala Asp lie Leu Ser Leu Leu Gin Ala Leu Ser Ala lie Ala 
608 



1825 CAA TGG CAA CAA CAG CAC GAT TTA GAA TTT TCA GCA CTG CTT TTG CTG 1372 
609 Gin Trp Gin Gin Gin His Asp Leu Giu Phe Sar Ala Leu Leu Leu Leu 624 



1373 TTG AGT GAC AAC CCT ATT TCT ACC TCG CAC GGC ACT GAC GAT CAA TTG 1920 
625 Leu Ser Asp Asn Pro lie Ser Thr Ser Gin Gly Thr Asp Asp Gin Leu 640 



1921 AAC TTT ATC CGT CAA GTG TGG CAG AAC CTA GGC AGT ACG TTT GTG GGT 1968 

64i Asn Phe He Arg Gin Vai Trp Gin Asn Leu Gly Ser Thr Phe Val Gly 656 



1969 GCA ACA TTG TTG TCC CGC AGT GGG GCA CCA TTA GTC GAT ACC AAC GGC 2016 
657 Ala Thr Leu Leu Ser Arg ser Giy Ala Pro Leu Val Asp Thr Asn Gly 672 



2017 CAC GCT ATT GAC TGG TTT GCT CTG CTC TCA GCA GGT AAT AGT CCG CTT 2064 
673 His Ala He Asp Trp Phe Ala Leu Leu Ser Ala Giy Asn Ser Pro Leu odd 



2065 ATC GAT AAG GTT GGT CTG GTG ACT GAT GCT GGC ATA CAA AGT GTT ATA 2112 

639 lie Asp Lys Val Gly Leu Vai Thr Asp Ala Gly He Gin Ser Vai He 704 



2113 GCA ACG GTG GTC AAT ACA CAA AGC TTA TCT GAT GAA GAT AAG AAG CTG 2160 
705 Ala Thr Val Val Asn Thr Gin Ser Leu Ser Asp Giu Asp Lys Lys Leu 720 



2161 GCA ATC ACT ACT CTG ACT AAT ACG TTG AAT CAG GTA CAG AAA ACT CAA 2203 
721 Ala He Thr Thr Leu Thr Asn Thr Leu Asn Gin Val Gin Lys Thr Gin '736 



2209 CAG GGC GTG GCC GTC AGT CTG TTG GCG CAG ACT CTG AAC GTG AGT CAG 2256 
737 Gin Giy Val Ala Val Ser Leu Leu Ala Gin Thr Leu Asn Val Ser Gin 752 



2257 TCA CTG CCT CCG TTA TTG TTG CGC TGG AGT GGA CAA ACA ACC TAC CAG ^304 
753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gin Thr Thr Tyr Gin 763 



2305 TGG TTG AGT GCG ACT TGG GCA TTG AAG GAT GCC GTT AAG ACT GCC CCC 2352 
769 Trp Leu Ser Ala Thr Trp Aia Leu Lys Asp Ala Val Lys Thr Ala Ala 734 



2353 GAT ATT CCC GCT GAC TAT CTG CGT CAA TTA CGT GAA GTG GTA CGC CGC 2400 

735 Asp He Pro Ala Asp Tyr Leu Arg Gin Leu Arg Giu Vai Vai Arg Arg 300 



2401 TCC TTG TTG ACC CAA CAA TTC ACG CTG AGT CCT GCA ATG GTG CAA ACC 2448 
301 Ser Leu Leu Thr Gin Gin Phe Thr Leu Ser Pro Aia Mec Vai Gin Thr 3i6 
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2445 TTG CTC CAC TAT CCA 3CC TAT TTT CGC OCT TCC CCA O^-A \C- - - 

317 L.U Leu ASP ryr Pro Ala Tyr Phe cly Ala Ser AU ciu ^hr Thr t'.'"' 

2497 GAT ATC ACT TTG TCG ATG CTT TAT ACC CTO AGC TGT TAT ACC <:-T TTi , 

833 ASP He s.r L.u Trp Met Leu Tyr Thr Leu Ser ^1 sTr lit 



in 7^ ATC GOT GAA OCT GOT OCT ACC CAA CAT CAT CTA CTC GCc 

10 349 Leu Leu Gin Mec Gly Olu Ala cly Cly Thr Giu AsJ Asp 5It °u 



3e4 



2593 TAG TTA CGC ACA GCT AAT CCT ACC ACA CCG TIC AGC CAA TCT GAT GCT -^^n 

305 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu s.v Isl mI lU" 

^'^^ "° ACG CTA TTG GGT TCG GAG CTT AAC GAG TTr r- . 

381 Ala Gin Thr Leu Ala Thr Leu Leu Gly Trp oTu JJn cT. oTn 9^ 



2689 GCC GCT TGG TCG GTA TTC GGC GGG ATT rrr ao/- 
397 Ala Ala Trp Ser Val Leu ^ ^TC 273. 



'ir, ^^"^ ^"^ ""^ ™= «=A CAC AAC CAA ACT GGT CTT CGC 27a. 

913 ASP Ala Leu Leu Arg Leu Gin Gin Ala Gin Asn Gin Thr Sy l7u ^y 323' 



> SIT 5^ °^ '^^'^ ^TC CTC ACT CCT CAC ACT GAT TAT 283 2 

» 929 val Thr Gin Gin Cln Gin cly Tyr Leu Leu Ser Arg Asp Ser Asp 3?r "4 

^lli ^^^^ "'^^ "^"T "^AC GCC CTC CTC CCT CGC GTA TCC CAT 2aao 

945 Thr Leu Trp Cln Ser Thr Gly Gin Ala Leu Val Ala Gly °ll nil HI 

2881 CTC AAC GGC ACT AAC TCA 2898 
96l Val Lys Gly Ser Asn End 966 

(2) INFORMATION FOR SEQ ID NO: 57 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 965 amino acids 

(B) TYPE: amino acid 

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 (TccA peptide)
Features From To Description 

1 10 SEQ ID NO: 8 



1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16
17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32
33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48
49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64
65 His Gln Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80
81 Gln Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96
97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112
113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128
129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144
145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160
161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gln Leu Gln Leu Val Asn Glu 176
177 Ile Leu Ser Lys Ala Ile Gln Lys Lys Leu Ser Leu Thr Asp Leu Glu 192
193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208
209 Tyr His Tyr Gly His Gln Gln Ile Gln Thr Ala Gln Ser Val Leu Gly 224
225 Thr Thr Leu Gln Asp Ile Thr Leu Pro Gln Thr Leu Asp Leu Pro Gln 240
241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256
257 Ala Leu Thr Arg Leu Gln Ile Met Ala Ser Gln Phe Ser Pro Glu Gln 272
273 Gln Lys Ile Ile Thr Glu Thr Val Gly Gln Asp Phe Tyr Gln Leu Asn 288
289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr Ile 304
305 Met Thr Asp Arg Thr Ser Leu Thr Val Pro Gln Val Glu Leu Met Leu 320
321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 336
337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe Ile 352
353 His Ala Gly Lys Pro Glu Ala Ile Thr Leu Ser Arg Ser Gly Ala Glu 368
369 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 384
385 Arg Ile Asn Arg Thr Val Arg Leu Gln Lys Trp Leu Asn Leu Pro Tyr 400
401 Glu Asp Ile Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416
417 Asn Thr Ala Leu Ser Met Asn Asp Asn Thr Leu Arg Met Leu Gly Val 432
433 Phe Lys His Tyr Gln Ala Lys Tyr Gly Val Ser Ala Lys Gln Phe Ala 448
449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala Ile Thr Pro Ala Thr Pro 464
465 Phe Leu Asp Gln Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 480
481 Val Ile Asp Asn Gln Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 496
497 Gly Ala Arg Val Lys His Ile Ser Thr Ala Leu Gly Leu Asn His Arg 512
513 Gln Phe Leu Leu Leu Ala Asp Asn Ile Ala Arg Gln Gln Gly Asn Val 528
529 Thr Gln Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544
545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly Ile Asn Pro Glu Ser Phe 560
561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly Ile Val Trp Gln 576
577 Gln Leu Ala Gly Lys Pro Thr Ile Thr Val Pro Gln Lys Asp Ser Pro 592
593 Leu Ala Ala Asp Ile Leu Ser Leu Leu Gln Ala Leu Ser Ala Ile Ala 608
609 Gln Trp Gln Gln Gln His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624
625 Leu Ser Asp Asn Pro Ile Ser Thr Ser Gln Gly Thr Asp Asp Gln Leu 640
641 Asn Phe Ile Arg Gln Val Trp Gln Asn Leu Gly Ser Thr Phe Val Gly 656
657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672
673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688
689 Ile Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gln Ser Val Ile 704
705 Ala Thr Val Val Asn Thr Gln Ser Leu Ser Asp Glu Asp Lys Lys Leu 720
721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gln Val Gln Lys Thr Gln 736
737 Gln Gly Val Ala Val Ser Leu Leu Ala Gln Thr Leu Asn Val Ser Gln 752
753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gln Thr Thr Tyr Gln 768
769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784
785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800
801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816
817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832
833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848
849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864
865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880
881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896
897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912
913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928
929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944
945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960
961 Val Lys Gly Ser Asn 965

(2) INFORMATION FOR SEQ ID NO: 58

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4698 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 (tccB)

1 ATG TTA TCG AC A ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCC 4: 
1 Met Leu Ser Thr Met Glu Lys Gin Leu Asn Glu Ser Gin Arg Asp Aia i- 



49 TTG GTG ACT GGC TAT ATG AAT TTT GTG GCG CCG ACG TTG .\AA GCC GTC 
1. Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Vai 
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5? iGT GGT CAC CCG GTC ACG GTG CAA GAT TTA TAC GAA TAT TTC CTG ATT 144 

3 3 Ser Gly Gin Pro Val Thr Val Glu Asp Leu r/r Clu Tyr Leu Leu ri« 43 

5 145 GAC CCG GAA GTG GCT GAT GAG GTT GAG ACG AGT CGG GTA GCA CA.a GCG 192 

49 Asp Pro Glu Val Aid Asp Giu Vai Giu Thr Ser Arg Vai Aia Gin Ala 64 

193 ATT GCC ACC ATA CAG CAA TAT ATC ACT CGT CTG GTC AAC GGC TCT GAA J40 

10 65 lie Ala Ser He Gin Gin Tyr Mec Thr Arg Leu Vai Asn Oiy Ser Glu 30 






241 CCG GGG CGT CAG CCG ATC CAG CCT TCT ACA GCT AAC CAA TCG CGT GAT 283 

81 Pro Gly Arg Gin Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96 

289 AAT GAT AAC CAA TAT GCT ATC TCG GCT GCG GGG GCT GAG GTT CGA AAT 3 36 

97 Asn Asp Asn Gin T/r Ala He Trp Ala Ala Gly Ala Glu Vai Arg Asn il2 

3 37 TAC GCT GAA AAC TAT ATT TCA CCC ATC ACC CGG CAG GAA AAA AGC CAT 334 

113 Tyr Ala Glu Asn Tyr He Ser Pro He Thr Arg Gin Glu Lys Ser His IZ3 



25 3 35 TAT TTC TCG GAG CTG GAG ACG ACT TTA AAT CAG AAT CGA CTC GAT CCG 43 2 

129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gin Asn Arg Leu Asp Pro 144 

433 GAT CGT GTG CAC GAT GCT GTT TTC GCG TAT CTC AAT GAG TTT GAG GCA 430 

30 145 Asp Arg Vai Gin Asp Ala Vai Leu Ala Tyr Leu Asn Clu Phe Glu Ala 160 



481 GTG AGT AAT CTA TAT GTC CTC ACT OCT TAT ATT AAT CAG GAT AAA TTT 528 

161 Vai ser Asn Leu Tyr Vai Leu Ser Gly Tyr He Asn Gin Asp Lys Phe 176 

529 GAC CAA GCT ATC TAC TAC TTT ATT GGT CGC ACT ACC ACT AAA CCG TAT 576 

177 Asp Gin Ala He Tyr Tyr Phe He Gly Arg Thr Thr Thr Lys Pro Tyr 192 

577 CGC TAC TAC TGG CGT CAG ATG GAT TTC AGT AAG AAC CGT CAA GAT CCG 62 4 

193 Arg Tyr Tyr Trp Arg- Gin Mec Asp Leu Ser Lys Asn Arg Gin Asp Pro 208 



45 625 GCA GGG AAT CCG GTG ACG CCA AAT TGC TGG AAT GAT TGG CAG GAA ATC 672 

209 Ala Giy Asn Pro Vai Thr Pro Asn Cys Trp Asn Asp Trp Gin Glu He 224 

673 ACT TTG CCG CTG TCT GGT GAT ACG GTG CTG GAG CAT ACA GTT CGC CCG 7 20 

50 225 Thr Leu Pro Leu Ser Gly Asp Thr Vai Leu Giu His Thr Vai Arg Pro 240 



721 GTA TTT TAT AAT GAT CGA CTA TAT GTG GCT TGG GTT GAG CGT GAC CCG "63 

241 Vai Phe Tyr Asn Asp Arg Leu Tyr Vai Ala Trp Vai Glu Arg Asp Pro 256 

7 69 GCA GTA CAG AAG GAT CCT CAC CGT AAA AAC ATC GGT AAA ACC CAT GCC 316 

257 Ala Vai Gin Lys Asp Ala Asp Gly Lys Asn He Gly Lys Thr His Ala 272 

317 TAC AAC ATA AAC TTT GCT TAT AAA CCT TAT CAT GAT ACT TGC ACA GCG 364 

273 Tyr Asn He Lys Phe Giy Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala :33 



65 8 65 CCG AJiT ACG ACC ACG TTA ATG ACA CAA CAA GCA GGG GAA AGT TCA G.\A 512 

289 Pro Asn Thr Thr Thr Leu Met Thr Gin Gin Ala Giy Giu Ser S«r Clu 3:4 

-240- 



SUBSTITUTE SHEET (RULE 26)



WO 97/17432



PCT/US96/18003



Vni f°* ^ ^"^ ^ °AT CAA TCT ACC ACC ACA TTC -dC 

305 Thr Gin Arg S«r Ser L*u Leu Ila Asp Clu Sar Ser Thr Thr lIu 



J 60 
220 



ct^ vIT -"^^^ -'^^ ^^'^ ^^'T ^AT CCG ACG GAG 

321 Gin Val Asn Leu Leu Ala Thr Thr Asp Phe Ser lie Asp Pro Thr Glu 



\ % » 

J JO 



"5° ^^'^ •'^^'^ -^^"^ ^^'^ "fAT GGC CGC CTA ATC TW GOG OTC TIT CTC I - 
337 Glu Thr ASP s« Asn Pro Tyr Cly Arg Leu Met Leu 0^ 51? 5^ Ssl" 



^ill Arl cti! IT rt* ^ AGA AAA AAT AAA CCC GTT GTT 

353 Arg Gin Phe Glu Gly Asp Gly Al* Asn Arg Lys Asn Lys Pro Val Val 



U04 
363 



1105 TAT GOT TAT CTC TAT TGT GAC TCA GCT TTC AAT CGT CAT GTT CTC ACC 
369 Tyr Cly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg hTs 511 III S 



1152 
384 



AAG AAC TTT TTG TTC ACT ACT TAC CGT GAT GAA ACG GAT 
385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg AsJ ctS Asp 



1200 
400 



iSl G?3 1^ AGC TTG CAA TTT GCG GTA TAC GAT AAA AAC TAT GTA ATT 

401 Gly Gin Asn Ser Leu Gin Phe Ala Val Tyr Asp Lys Lys Tyr Val lie 



1243 
416 



ril JJf SIT ^ ^l" "'^^ <=CC GAA AAT ACA GGA TCG 

417 Thr Lys val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 



1296 
432 



i!I 5It oil ^ JTT TTG AAA CAA GGC ACT ACT GGG GCC TAT GTG 

433 Val ser Lys Val Asp Asp Leu Lys Gin Gly Thr Thr Gly Ala Tyr Val 



1344 
448 



^Itl I*'' r"^ ^^"^ ACG CTT CAT ATA CAA ACC ACA ACT AAT 

449 Tyr He Asp Gin Asp Gly Leu Thr Leu His He Gin Thr Thr Thr Isn 



1392 
464 



^l^-l °AT TTT ATT AAC CGT CAT ACG TTT GGA TAT AAC GAT CTT GTA TAT 

4o5 Gly Asp Phe He Asn Arg His Thr Phe Gly lyr Asn Asp Leu Val "T/r 



1440 
430 



1441 GAT TCT AAG TCT GCT TAT GCT TTC ACG TCG TCA GGA AAT GAA GGT TIT 
481 Asp ser Lys Ser Gly Tyr Gly Phe Thr Trp ser cly Asn Glu Gly Phe 



1438 
496 



'111 3Jr S ?f I**" 5""^ '^'^ "^^"^ ACC nr CAT AAT GCA ATA 

497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Ti'r Thr Phe His Asn Ala He 



1536 
512 



1537 
513 



tT '^'^'^ ^A TAT GGT GGT GGA TCT GTT CCT AAT GC:. .584 

lie Asn Tyr V/r Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 523 



1585 
529 



^hr ^ ^"^ AAT GAG GGA TCG CCT ATT GCT CCC 16 32 

Thr Trp Ala Leu Glu Gin Arg He Asn Glu Gly Trp Ala He Ala Pro 544 



'III fl! ""^ '^^'^ AAG GGC ACT TAT ATC CCT 1630 

545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Cly Ser Tyr He Ala 560 
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lo31 TGG GAA GGC CAA ACA OCT ACC 3GT TAT AAT CTG TAT ATT CCA GnT CGT i->a 
561 Trp Glu Cly Glu Thr Pro Thr Gly Tyr Asn L«u Tyr He Pro Asp oly 576 



1729 ACC GTC rrC CTA GAT TGG TTT GAT AAA ATA AAT TTT GCT ATT GGT CTT 1776 

577 Thr Val Leu L0u Asp Trp Phe Asp Lys I Id Asn Phe Ala He cly Leu 592 






1777 AAT AAG CTT GAG TCT GTA TTT ACG TCG CCA GAT TGG CCA ACA CT\ ACC 1824 
593 Asn L/s Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608 






1825 ACT ATC AAA AAT TTC ACT AAA ATC GCC GAT AAC CGC AAA TTC TAT CAG 

609 Thr He Lys Asn Phe Ser Lys He Ala Asp Asn Arg Lys Phe Tyr Gin 

1873 GAA ATC AAT OCT GAG ACG GCG GAT GGA CGC AAC CTC TTT AAA CGT TAC 

625 Glu He Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 

1921 ACT ACT CAA ACT TTC GGA CTT ACC AGC GGT GCG ACT TAT TCT kC\ kCT 

641 Ser Thr Gin Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 



1372 
624 



1920 
640 



1968 
656 






1969 
657 



TAT ACT TTG TCT GAG GCG GAT TTC TCC ACT GAT CCG GAC AAA AAC TAC 2016 
Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672 






2017 
673 



CTA CAG GTT TGT TTG AAT GTC GTG TGG GAT CAT TAT GAC CGC CCG TCA 2064 
Leu Gin Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688 






2065 GGG AAA AAA GCC GCT TAT TCT TGG GTC ACT AAC TGG TTT AAC GTC TAT 
689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 



2112 
704 






2113 GTT GCG TTC CAA GAT AGC AAA GCT CCG GAT GCC ATT CCT CCA TTA GTT 
705 Val Ala Leu Gin Asp Ser Lys Ala Pro Asp Ala He Pro Arg Leu Val 



2160 
720 



2161 TCC CGT TAC GAT ACT AAA CGT GGT CTC GTG CAA TAT CTG GAC TTC TGG 2208 
721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gin Tyr Leu Asp Phe Trp 736 






2209 
737 



ACC TCA TCA TTA CCC GCG AAA ACC CGT CTT AAC ACC ACC TTT GTG CGT 2256 
Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 7 52 






2257 
753 



ACT TTG ATT GAG AAG GCT AAT CTG GGC CTG GAT ACT TTG CTG GAT TAC ■>304 
Thr Leu He Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 763 






2305 ACC TTG CAG CCA GAT CCT TCT CTG GAA GCA GAT TTA GTG ACT GAC GGC 
769 Thr Leu Gin Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 



2352 
734 






2353 AAA AGC GAA CCA ATC GAC TTT AAT GGT TCA AAC GCT CTC TAT TTC TGG 2400 

785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 300 

2401 GAA TTG TTC TTT CAC CTG CCG TTT TTG GTT GCT ACA CGC TTT GCC \AC M43 

301 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 316 






2449 

817 



GAA CAG CAA TTT TCG CCG GCA CAA AAG ACT TTG CAT TAC ATC TTT GAC 2 AS 6 
Glu Gin Gin Phe Ser Pro Ala Gin Lys Ser Leu His Tyr He Phe Asp 332 
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2457 CCG GCC ATG K\A AAC AAG CCA CAC AAT GCC CCG GCT TAT TGG AAT GTA 

53 3 Pro Aid Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn VaI 34a 






:545 
349 



365 



CGT CCC TTC GTT GAA GGA AAC AGC GAT TTG TCA CGT CAT TTG GAC GA? :592 

Arg Pro Lou Val Glu Cly Asn Ser Asp L«u Ser Arg His Lau Asp Asp 864 

TCT ATA GAC CCA GAT ACT C.aj\ GCT TAT GCT CAT CCG GTG ATA TAC CAG 2 64 0 

Ser lie Asp Pro Asp Thr Gin Ala Tyr Aia His Pro Vai lie T*/r Gin 330 






2641 AAA GCG GTG TTT ATT GCC TAT GTC AGT .VAC CTG ATT GCT CAG GGA GAT 2638 
381 Lys Ala Val Phe lie Aia Tyr Vai Ser Asn Leu lie Aia Gin Giy Asp lj6 






2539 ATG TGG TAT CGC CAA TTG ACT CGT GAC GGT CTG ACT CAG GCC CGT GTC 273 6 
897 Met Trp Tyr Arg Gin Leu Thr Arg Asp Giy Leu Thr Gin Aia Arg Val 512 






913 



TAT TAC AAT CTG GCC GCT GAA TTC CTA GCG CCT CGT CCG GA.T GTA TC3 2734 

Tyr Tyr Asn Leu Aia Ala Glu Leu Lau Giy Pro Arg Pro Asp Val Ser 928 






2785 
929 



2833 
945 



CTG AGT AGC ATT TGG ACG CCG CAA ACC CTG GAT ACC TTA GCA GCC GGG 2332 

Leu Ser Ser lie Trp Thr Pro Gin Thr Leu Asp Thr Leu Aia Ala Giy 944 

CAA AAA GCG GTT TTA CGT GAT TTT GAG CAC CAG TTG GCT AAT AGT GAT 2330 

Gin Lys Aia Vai Leu Arg Asp Phe Glu His Gin Leu Ala Asn Ser Asp 960 






2881 ACC GCT TTA CCC GCA TTG CCG GCC CGC AAT GTC AGC TAC TTG AAA CTG 2923 
961 Thr Ala Leu Pro Aia Leu Pro Giy Arg Asn Vai Ser Tyr Leu Lys Leu 976 






2929 
977 



GCA GAT AAT GGC TAC TTT AAT GAA CCC CTG AAT GTT CTG ATG TTG TCT 2976 
Aia Asp Asn Giy Tyr Phe Asn Glu Pro Leu Asn Vai Leu Met Leu Ser 992 






2977 CAC TGG GAT ACG TTG GAT GCA CGG TTA TAC AAT CTG CGT CAT A.^C CTG 3 024 
993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1003 






3 025 ACC GTT GAT GGC AAG CCG CTT TCG CTG CCG CTG TAT GCT GCG CCT GTT 

1009 Thr Val Asp Giy Lys Pro Leu Ser Leu Pro Leu Tyr Aia Aia Pro Val 

3 07 3 GAT CCG GTA GCG TTG TTG GCT CAG CGT GCT CAG TCC GGC ACG TTG ACG 

1025 Asp Pro Vai Aia Leu Leu Ala Gin Arg Ala Gin Ser Giy Thr Leu Thr 



J u / 2 
1024 



3120 
1040 



3121 AAT GGC GTC AGT GGC GCC ATG TTG ACG GTG CCG CCA TAC CGT TTC AGC 3168
1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056



3169 GCT ATG TTG CCG CGA GCT TAC AGC GCC GTG GGT ACG TTG ACC AGT TTT 3216
1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072



3217 GGT CAG AAC CTG CTT AGT TTG TTG GAA CGT AGC GAA CGA GCC TGT CAA 3264
1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088
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>265 G.iJ^ GAG TTG CCG CAA CAG CAA CTG TTG OAT ATG TCC AGC TAT GCC ATT } 

1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104

3313 ACG TTG CAA CAA CAG GCG CTG CAT CGA TTG GCG GCA GAT CGT CTG CCG 3360

1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120

3361 CTC CTA GCT ACT CAG GCT ACG GCA CAA CAG CGT CAT GAC CAT TAT TAC 3408

1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136

3409 ACT CTG TAT CAG AAC AAC ATC TCC ACT GCG GAA CAA CTG GTG ATG GAC 3456

1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152

3457 ACC CAA ACG TCA GCA CAA TCC CTG ATT TCT TCT TCC ACT GCT CTA CA.=v 3504

1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168

3505 ACT GCC ACT GGG GCA CTG AAA GTG ATC CCG AAT ATC TTT GGT TTG GCT 3552

1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184

3553 GAT GGC GCC TCC CGC TAT CAA CGA CTA ACG GAA GCG ATT GCC ATC GGG 3600

1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200

3601 TTA ATG GCT GCC CGA CAA GCC ACC AGC GTG GTG GCC GAG CGT CTG GCA 3648

1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216

3649 ACC ACC GAC AAT TAC CGC CGC CGC CGT CAA GAG TGG CAA ATC CAA TAC 3696

1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232

3697 CAG CAG GCA CAG TCT GAG CTC GAC GCA TTA CAG AAA CAG TTG GAT GCG 3744

1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248

3745 CTG GCA GTG CGC GAG AAA GCA GCT CAA ACT TCC CTG CAA CAG GCG AAC 3792

1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264

3793 GCA CAG CAG GTA CAA ATT CGC ACC ATG CTG ACT TAC TTA ACT ACT CGT 3840

1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280

3841 TTC ACC CAC CCG ACT CTG TAC CAG TGG CTC ACT GGT CAA TTA TCC GCG 3888

1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296

3889 TTG TAT TAT CAA GCG TAT GAT GCC GTG GTT GCT CTC TCC CTC TCC GCC 3936

1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312

3937 CAA GCT TCC TGG CAG TAT GAA TTG GGT GAT TAC GCT ACC ACT TTT ATC 3984

1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328

3985 CAC ACC GCT ACC TGG AAC GAC CAT TAC CGT GGT TTC CAA GTG GGG GAG 4032

1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344

4033 ACA CTG CAA CTC AAT TTG CAT CAG ATG GAA GCG GCC TAT TTA GTT CGT 4080

1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360
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30 






4081 CAC GAA CCC CGT CTT AAT GTG ATC CGT ACT GTG TCG CTC AAA AGC CTA 4128

1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376

4129 TTC GGT GAT GAT GGT TTT GGT AAG TTA AAA ACC GAA GGC AAA GTC GAC 4176

1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392

4177 TTT CCA TTA AGC GAA AAG CTG TTT GAC AAC GAC TAT CCG GGG CAC TAT 4224

1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408

4225 TTG CGC CAG ATT AAA ACT GTG TCA GTG ACG TTG CCG ACG TTA GTC GGG 4272

1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424

4273 CCG TAT CAA AAC GTG AAG GCA ACG CTC ACT CAG ACC AGC AGC ACT ATA 4320

1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440

4321 TTG TTA CCA GCA GAT ATC AAT GGT GTT AAA CGT CTC AAT GAT CCG ACA 4368

1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456

4369 GGT AAA GAG GGT GAT GCG ACG CAT ATT GTC ACC AAT CTG CGT GCC AGC 4416

1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472

4417 CAG CAG GTG GCG CTC TCT TCT GGC ATT AAT GAT GCC GGT AGC TTT GAG 4464

1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488

4465 TTG CGT TTG GAA GAT CAG CGC TAT CTA TCA TTT GAG GGG ACT GGA GCT 4512

1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504

4513 GTT TCC AAA TCG ACT CTT AAC TTC CCG CGT TCT GTG GAT GAG CAT ATT 4560

1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520

4561 GAC GAT AAG ACA TTG AAA GCG GAT GAG ATG CAG GCC GCA CTG TTG CCG 4608

1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536

4609 AAT ATG GAT GAT GTG CTG GTG CAG GTG CAT TAT ACC GCC TGC GAC GGC 4656

1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552

4657 GGC GCC ACT TTC GCA AAC CAG GTC AAG AAA ACA CTC TCT TAA 4698

1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser End 1566



(2) INFORMATION FOR SEQ ID NO: 59

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1565 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 (TccB peptide)
Features From To Description
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il SEQ ID MO: 7 






i 


Mec Leu 


Ser 


Thr 


Mec 


Glu 


Lys 


Gin 


Leu Asn Glu 


Ser 


Gin 


Arg 


Asp 




1 6 




Lau Val 


Thr 


Gly 


Tyr 


Mec 


Asn 


Phe 


Val Ala Pro 


Thr 


Leu 


Lys 


Gly 


Val 


52 


33 


Set Gly 


Gin 


Pro 


Val 


Thr 


Val 


Glu 


Asp Leu Tyr 


Glu 


Tyr 


Leu 


Leu 


He 


43 


45 


Asp Pro 


Glu 


Val 


Ala 


Asp 


Glu 


Val 


Glu Thr Ser 


Arg 


Val 


Ala 


Gin 


Ala 


64 


65 


He Ala 


Ser 


He 


Gin 


Gin 


Tyr 


Mec 


Thr Arg Leu 


Val 


Asn Gly 


Ser 


Glu 


30 


31 


Pro Gly 


Arg 


Gin 


Ala 


Mec 


Glu 


Pro Ser Thr Ala Asn Glu Trp Arg Asp 


96 


97 


Asn Asp 


Asn 


Gin 


Tyr 


Ala 


He 


Trp 


Ala Ala Gly 


Ala 


Glu 


Val 


Arg 


Asn 


112 


il3 


T/r Ala 


Glu 


Asn 


Tyr 


He 


Ser 


Pro 


He Thr Arg 


Gin 


Glu 


Lys 


Ser 


Hxs 


123 


129 


Tyr Phe 


Ser 


Glu 


Leu 


Glu 


Thr 


Thr 


Leu Asn Gin 


Asn 


Arg 


Leu 


Asp 


Pro 


144 


145 


Asp Arg 


Val 


Gin 


Asp 


Ala 


val 


Leu 


Ala Tyr Leu 


Asn 


Glu 


Phe 


Glu 


Ala 


160 


161 


Val Ser 


Asn 


Leu 


T/r 


Val 


Leu 


Ser 


Gly T/r He 


Asn 


Gin 


Asp 


Lys 


Phe 


176 


177 


Asp Gin 


Ala 


He 


Tyr 


Tyr 


Phe 


He Gly Arg Thr 


Thr 


Thr 


Lys 


Pro 


T/r 


193 


193 


Arg Tyr 


Tyr 


Trp 


Arg 


Gin 


Mec 


Asp 


Leu Ser Lys 


Asn 


Arg 


Gin 


Asp 


Pro 


203 


209 


Ala Gly 


Asn 


Pro 


Val 


Thr 


Pro 


Asn 


Cys Trp Asn 


Asp 


Trp 


Gin 


Glu 


He 


224 


225 


Thr Leu 


Pro 


Leu 


Ser Gly Asp Thr Val Leu Glu 


His 


Thr 


Val 


Arg 


Pro 


240 


241 


Val Phe 


Tyr 


Asn 


Asp 


Arg 


Leu 


Tyr 


Val Ala Trp 


Val 


Glu 


Arg 


Asp 


Pro 


256 


257 


Ala Val 


Gin 


Lys 


Asp Ala Asp Gly 


Lys Asn He 


Gly 


Lys 


Thr 


His 


Ala 


272 


273 


Tyr Asn 


He 


Lys 


Phe 


Gly 


Tyr 


Lys 


Arg Tyr Asp 


Asp 


Thr 


Trp 


Thr 


Ala 


238 


289 


Pro Asn 


Thr 


Thr 


Thr 


Leu 


Mec 


Thr 


Gin Gin Ala Gly Glu 


Ser 


Ser 


Glu 


304 


305 


Thr Gin 


Arg 

V 


Ser 


Ser 


Leu 


Leu 


He 


Asp Glu Ser 


Ser 


Thr 


Thr 


Leu 


Arg 


.JO 


321 


Gin Val 


Asn 


Leu 


Leu 


Ala 


Thr 


Thr 


Asp Phe Ser 


He 


Asp 


Pro 


Thr 


Glu 


336 


337 


Glu Thr 


Asp 


Ser 


Asn 


Pro Tyr Gly 


Arg Leu Mec 


Leu 


Gly 


Val 


Phe 


Val 


352 


353 


Arg Gin 


Phe 


Glu 


Gly 


Asp 


Gly 


Ala 


Asn Arg Lys 


Asn 


Lys 


Pro 


Val 


Vdl 


363 


369 


Tyr Gly 


Tyr 


Leu 


Tyr 


Cys 


Asp 


Ser 


Ala Phe Asn 


Arg 


His 


Val 


Leu 


Arg 


334 


385 


Pro Leu 


Ser 


Lys 


Asn 


Phe 


Leu 


Phe 


Ser Thr Tyr 


Arg 


Asp 


Glu 


Thr 


Asp 


400 


401 


Gly Gin 


Asn 


Ser 


Leu 


Gin 


Phe 


Ala 


Val Tyr Asp 


Lys 


Lys 


Tyr 


Val 


He 


n X o 


417 


Thr Lys 


Val 


Val 


Thr 


Gly 


Ala 


Thr 


Glu Asp Pro 


Glu 


Asn 


Thr Gly 


Trp 


432 


433 


Val Ser 


Lys 


Val 


Asp 


Asp 


Leu 


Lys 


Gin Gly Thr Thr Gly 


Ala 


Tyr 


Val 


443 


449 


Tyr He 


Asp 


Gin 


Asp 


Gly 


Leu 


Thr 


Leu His He 


Gin 


Thr 


Thr 


Thr 


Asn 


464 


465 


Gly Asp 


Phe 


He 


Asn 


Arg 


His 


Thr 


Phe Gly Tyr Asn Asp Leu Val 


T/r 


430 


431 


Asp Ser 


Lys 


Ser 


Gly 


Tyr 


Gly 


Phe 


Thr Trp Ser 


Gly 


Asn 


Glu 


Gly 


Phe 


4 3 6 


497 


Tyr Leu 


Asp 


Tyr 


His 


Asp 


Gly 


Asn 


Tyr T/r Thr 


Phe 


His 


Asn 


Ala 


He 


512 
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IS 



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65 







Asn 


Ti'r 


T/r 


Pro 


Ser 


Gly 


T/r 


Gly 


Gly 


Gly 


Ser 


Val 


Pro 


Asn 


t . . 




529 


Thr 


Trp 


Ala 


Leu 


Glu 


Gin 


Arg 


lie 


Asn 


Glu 


Gly 


Trp 


Ala 


He 


Ala 


Pro 


!44 


545 


Lau 


Leu 


Asp 


Thr 


Leu 


His 


Thr 


Val 


Thr 


Val 


Lys 


Gly 


Ser 


T/r 


He 


Ala 


560 


561 


Trp 


Glu 


Gly 


Glu 


Thr 


Pro 


Thr 


Gly 


Tyr 


Asn 


Leu 


T/r 


He 


Pro 


Asp 


Gly 


576 




Thr 


Val 


Leu 


Leu 


Asp 


Trp 


Phe 


Asp 


Lys 


He 


Asn 


Phe 


Ala 


He 


Gly 


Leu 


552 


553 


Asn 


Lys 


Leu 


Glu 


Ser 


Val 


Phe 


Thr 


Ser 


Pro 


Asp 


Trp 


Pro 


Thr 


Leu 


Thr 


603 


6C9 


Thr 


lie 


Lys 


Asn 


Phe 


Ser 


Lys 


He 


Ala 


Asp 


Asn 


Arg 


Lys 


Phe 


T/r 


Gin 


624 


625 


Glu 


lie 


Asn 


Ala 


Glu 


Thr 


Ala 


Asp 


Gly 


Arg 


Asn 


Leu 


Phe 


Lys 


Arg 


T/r 


640 


641 


Ser 


Thr 


Gin 


Thr 


Phe 


Gly 


Leu 


Thr 


Ser 


Gly 


Ala 


Thr 


Tyr 


Ser 


Thr 


Thr 


655 


657 


Tyr 


Thr 


Leu 


Ser 


Glu 


Ala 


Asp 


Phe 


Ser 


Thr 


Asp 


Pro 


Asp 


Lys 


Asn 


T/r 


6-2 


673 


Leu 


Gin 


Val 


cys 


Leu 


Asn 


Val 


Val 


Trp 


Asp 


His 


T/r 


Asp 


Arg 


Pro 


Ser 


638 


639 


Gly 


Lys 


Lys 


Gly 


Ala 


Tyr 


Ser 


Trp 


Val 


Ser 


Lys 


Trp 


Phe 


Asn 


Val 


T/r 


704 


705 


val 


Ala 


Leu 


Gin 


Asp 


Ser 


Lys 


Ala 


Pro 


Asp 


Ala 


He 


Pro 


Arg 


Leu 


Val 


720 


721 


Ser 


Arg 


Tyr 


Asp 


Ser 


Lys 


Arg 


Gly 


Leu 


Val 


Gin 


Tyr 


Leu 


Asp 


Phe 


Trp 


736 


737 


Thr 


Ser 


Ser 


Leu 


Pro 


Ala 


Lys 


Thr 


Arg 


Leu 


Asn 


Thr 


Thr 


Phe 


Val 


Arg 


752 


753 


Thr 


Leu 


lie 


Glu 


Lys 


Ala 


Asn 


Leu 


Gly 


Leu 


Asp 


Ser 


Leu 


Leu 


Asp 


T/r 


763 


769 


Thr 


Leu 


Gin 


Ala 


Asp 


Pro 


Ser 


Leu 


Glu 


Ala 


Asp 


Leu 


Val 


Thr 


Asp 


Gly 


734 


735 


Lys 


Ser 


Glu 


Pro 


Met 


Asp 


Phe 


Asn 


Gly 


Ser 


Asn 


Gly 


Leu 


Tyr 


Phe 


Trp 


300 


801 


Glu 


Leu 


Phe 


Phe 


His 


Leu 


Pro 


Phe 


Leu 


Val 


Ala 


Thr 


Arg 


Phe 


Ala 


Asn 


315 


317 


Glu 


Gin 


Gin 


Phe 


Ser 


Pro 


Ala 


Gin 


Lys 


Ser 


Leu 


His 


Tyr 


He 


Phe 


Asp 


332 


333 


Pro 


Ala 


Mec 


Lys 


Asn 


Lys 


Pro 


His 


Asn 


Ala 


Pro 


Ala 


Tyr 


Trp 


Asn 


VaI 


343 


349 


Arg 


Pro 


Leu 


Val 


Glu 


Gly 


Asn 


Ser 


Asp 


Leu 


Ser 


Arg 


His 


Leu 


Asp 


Asp 


364 


365 


s«r 


lie 


Asp 


Pro 


Asp 


Thr 


Gin 


Ala 


Tyr 


Ala 


His 


Pro 


Val 


He 


T/r 


Gin 


380 


881 


Lys 


Ala 


Val 


Phe 


He 


Ala 


Tyr 


Val 


Ser 


Asn 


Leu 


He 


Ala 


Gin 


Gly 


Asp 


396 


3S7 


Mac 


Trp 


Tyr 


Arg 


Gin 


Leu 


Thr 


Arg 


Asp 


Gly 


Leu 


Thr 


Gin 


Ala 


Arg 


val 


912 


913 


Tyr 


Tyr 


Asn 


Leu 


Ala 


Ala 


Glu 


Leu 


Leu 


Gly 


Pro 


Arg 


Pro 


Asp 


Val 


Ser 


923 


929 


Leu 


Ser 


Ser 


He 


Trp 


Thr 


Pro 


Gin 


Thr 


Leu 


Asp 


Thr 


Leu 


Ala 


Ala 


Gly 


944 


945 


Gin 


Lys 


Ala 


Val 


Leu 


Arg 


Asp 


Phe 


Glu 


His 


Gin 


Leu 


Ala 


Asn 


Ser 


Asp 


360 


961 


Thr 


Ala 


Leu 


Pro 


Ala 


Leu 


Pro 


Gly 


Arg 


Asn 


Val 


Ser 


Tyr 


Leu 


Lys 


Liru 


576 


977 


Ala 


Asp 


Asn 


Gly 


Tyr 


Phe 


Asn 


Glu 


Pro 


Leu 


Asn 


Val 


Leu 


Mec 


Leu 


Ser 


592 


993 


His 


Trp 


Asp 


Thr 


Leu 


Asp 


Ala 


Arg 


Leu 


Tyr 


Asn 


Leu 


Arg 


His 


Asn 


Leu 


L003 


L009 


Thr 


Val 


Asp 


Gly 


Lys 


Pro 


Leu 


Ser 


Leu 


Pro 


Leu 


Ti'r 


Ala 


Ala 


Pro 


Val 


1024 


■25 


Asp 


Pro 


Val 


Ala 


Leu 


Leu 


Ala 


Gin 


Arg 


Ala 


Gin 


Ser 


Gly 


Thr 


Leu 


Thr 


1040 






1041 Asn Cly 

1057 Ala Met 

5 

1073 Gly Gin 

1089 Glu Glu 

lU 1105 Thr Leu 

1121 Leu Leu 

1137 Thr Leu 

15 

1153 Thr Gin 

1169 Thr Ala 

20 1185 Asp Gly 

1201 Leu Met 

1217 Thr Thr 

25 

1233 Gin Gin 

1249 Leu Ala 

30 1265 Ala Gin 

1281 Phe Thr 

1297 Leu Tyr 

35 

1313 Gin Ala 

1329 Gin Thr 

40 1345 Thr Leu 

1361 His Glu 

1377 Leu Gly 

45 

1393 Phe Pro 

1409 Leu Arg 

50 1425 Pro Tyr 

1441 Leu Leu 

1457 Gly Lys 

55 

1473 Gin Gin 

1489 Leu Arg 

60 1505 Val ser 

1521 Asp Asp 

1537 Asn Met 

65 

1553 Gly Ala 



Val 


Ser 


Gly 


Ala 


Mec 


Leu 


Pro 


Arg 


Ala 




Asn 


Leu 


Leu 


Ser 


Leu 


Leu 


Ala 


Gin 


Gin 


Gin 


Gin 


Gin 


Gin 


Ala 


Leu 


Ala 


Ser 


Gin 


Ala 


Thr 


Tyr 


Gin 


Asn 


Asn 


He 


Thr 


Ser 


Ala 


Gin 


Ser 


Ser 


Gly 


Ala 


Leu 


Lys 


Gly 


Ser 


Arg 


Tyr 


Glu 


Ala 


Ala 


Gly 


Gin 


Ala 


Glu 


Asn 


Tyr 


Arg 


Arg 


Ala 


Gin 


Ser 


Glu 


Val 


Val 


Arg 


Glu 


Lys 


Ala 


Gin 


Val 


Gin 


He 


Arg 


Gin 


Ala 


Thr 


Leu 


Tyr 


Tyr 


Gin 


Ala 


Tyr 


Asp 


Cys 


Trp 


Gin 


Tyr 


Glu 


Gly Thr 


Trp 


Asn 


Asp 


Gin 


Leu 


Asn 


Leu 


His 


Arg 


Arg 


Leu 


Asn 


Val 


Asp 


Asp 


Gly 


Phe Gly 


Leu 


Ser 


Glu 


Lys 


Leu 


Gin 


lie 


Lys 


Thr 


Val 


Gin 


Asn 


Val 


Lys 


Ala 


Ala 


Ala 


Asp 


He 


Asn 


Glu Gly 


Asp 


Ala 


Thr 


Val 


Ala 


Leu 


Ser 


Ser 


Leu 


Glu 


Asp 


Glu 


Arg 


Lys 


Trp 


Thr 


Leu 


Asn 


Lys 


Thr 


Leu 


Lys 


Ala 


Asp 


Asp 


Val 


Leu 


Val 


Ser 


Phe 


Ala 


Asn 


Gin 



Leu 


Thr 


Val 


Pro 


Pro 


Ser 


Ala 


Val 


Gly 


Thr 


Leu 


Glu 


Arg 


Ser 


Glu 


Leu 


Lau 


Asp 


Mec 


Ser 


Asp Cly 


Leu 


Ala 


Ala 


Ala 


Gin 


Gin 


Arg 


His 


Ser 


Ser 


Ala 


Glu 


Gin 


Leu 


He 


Ser 


Ser 


Ser 


Val 


He 


Pro 


Asn 


He 


Gly 


Val 


Thr 


Glu 


Ala 


Thr 


Ser 


Val 


Val 


Ala 


Arg 


Arg 


Glu 


Glu 


Trp 


Asp 


Ala 


Leu 


Gin 


Lys 


Ala 


Gin 


Thr 


Ser 


Leu 


Thr 


Met 


Leu 


Thr 


Tyr 


Gin 


Trp 

■ 


Leu 


Ser 


Gly 


Ala 


Val 


Val 


Ala 


Leu 


Leu Gly 


Asp 


Tyr 


Ala 


His 


Tyr 


Arg 


Gly 


Leu 


Gin 


Mec 


Glu 


Ala 


Ala 


He 


Arg 


Thr 


Val 


Ser 


Lys 


Leu 


Lys 


Thr 


Glu 


Phe 


Asp 


Asn 


Asp 


Tyr 


Ser 


Val 


Thr 


Leu 


Pro 


Thr 


Leu 


Thr 


Gin 


Thr 


Gly 


Val 


Lys 


Arg 


Leu 


His 


He 


Val 


Thr 


Asn 


Gly 


He 


Asn 


Asp 


Ala 


Tyr 


Leu 


Ser 


Phe 


Glu 


Phe 


Pro 


Arg 


Ser 


Val 


Asp 


Glu 


Mec 


Gin 


Ala 


Gin 


Val 


His 


Tyr 


Thr 


Val 


Lys 


Lys 


Thr 


Leu 
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Tyr 


Arg 


Phe 


3er 


1 C'56 


Leu 


Thr 


Ser 


Phe 


10-: 


Arg 


Ala 


cys 


Gin 


lodd 


Ser 


T/r 


Ala 


He 


1104 


Asp 


Arg 


Leu 


Ala 


112C 


Asp 


His 


Tyr 


Tyr 


1136 


Leu 


Val 


Mec 


Asp 


1152 


Thr 


Gly 


Val 


Gin 


Hod 


Phe 


Gly 


Leu 


Ala 


1134 


He 


Ala 


He 


Gly 


1200 


Glu 


Arg 


Leu 


Ala 


1216 


Gin 


He 


Gin 


Tyr 


1232 


Gin 


Leu 


Asp 


Ala 


1243 


Gin 


Gin 


Ala 


Lys 


1264 


Leu 


Thr 


Thr 


Arg 


1280 


Gin 


Leu 


Ser 


Ala 


1296 


Cys 


Leu 


Ser 


Ala 


1312 


Thr 


Thr 


Phe 


He 


1328 


Gin 


Val 


Gly 


Glu 


1344 


Tyr 


Leu 


Val 


Arg 


1360 


Leu 


Lys 


Ser 


Leu 


1376 


Gly 


Lys 


Val 


Asp 


1392 


Pro 


Gly 


His 


Tyr 


1403 


Thr 


Leu 


Val 


Gly 


1424 


Ser 


Ser 


Ser 


He 


1440 


Asn 


Asp 


Pro 


Thr 


1456 


Leu 


Arg 


Ala 


Ser 


1472 


Gly 


Ser 


Phe 


Glu 


1488 


Gly 


Thr 


Gly 


Ala 


1504 


Asp 


Glu 


His 


He 


1520 


Ala 


Leu 


Leu 


Ala 


15j6 


Ala 


Cys 


Asp 


Gly 


1552 



Ser 1565



SUBSTITUTE SHEET (RULE 26)

WO 97/17432

PCT/US96/18003



(2) INFORMATION FOR SEQ ID NO: 60

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3132 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 (tccC)

1 ATG ACT CCC TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA GTC AGC 48

1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16

49 GTG TTA GAT AAT CGC GGT CTG TCC ATT CGT GAT ATT GGT TTT CAC CCT 96

17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32

97 ATT GTA ATC GGG GGG GAT ACT GAC ACC CGC GTC ACC CGT CAC CAG TAT 144

33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48

145 GAT GCC CGT GCA CAC CTG AAC TAC AGT ATT GAC CCA CGC TTC TAT GAT 192

49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64

193 GCA AAG CAC CCT GAT AAC TCA GTA AAG CCT AAT TTT GTC TGG CAG CAT 240

65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80

241 GAT CTG GCC GGT CAT GCC CTG CGG ACA GAG AGT GTC GAT GCT GGT CGT 288

81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96

289 ACT GTT GCA TTG AAT GAT ATT GAA GGT CGT TCG GTA ATG ACA ATG AAT 336

97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112

337 GGG ACC GGT GTT CGT CAG ACC CGT CGC TAT GAA GGC AAC ACC TTG CCC 384

113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128

385 GGT CGC TTG TTA TCT GTG AGC GAG CAA GTT TTC AAC CAA GAG AGT GCT 432

129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144

433 AAA GTG ACA GAG CGC TTT ATC TGG GCT GGG AAT ACA ACC TCG GAG AAA 480

145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160

481 GAG TAT AAC CTC TCC GGT CTG TCT ATA CGC CAC TAC GAC ACA GCG GGA 528

161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176



A/1 HI AG'T CAG TCA CTC GCC GGC GCC ATG CTA TCC CkA 5*' 6 

177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192

577 TCT CAC CAA TTG CTC GCG GAA GGG CAG GAG GCT AAC TGG AGC GGT GAC 624

193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208
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625 GAC GAA ACT GTC TGC CAG GGA ATC CTG GCA AGT GAG GTC TAT ACG AC A z^l 
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224

673 CAA ACT ACC ACT AAT CCC ATC GGG GCT TTA CTG ACC CAA ACC GAT CCG 720
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240

721 AAA GGC AAT ATT GAG CGT CTG GCT TAT GAC ATT CCC GGT CAG TTA AJLn 768
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256

769 GGG AGT TGG TTG ACC CTG AAA GGC CAG AGT CAA CAG CTG ATT GTT AAG 816
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272

817 TCC CTG ACC TGG TCA CCC GCA GGT CAT AAA TTG CGT GAA GAG CAC GGT 864
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288

865 AAC GGC GTG GTT ACG GAG TAC AGT TAT GAG CCG GAA ACT CAA CGT CTG 912
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304

913 ATA GGT ATC ACC ACC CGG CGT GCC GAA GGG AGT CAA TCA GGA CCC AGA 960
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320

961 GTA TTG CAG GAT CTA CGC TAT AAG TAT GAT CCG GTG GGG AAT GTT ATC 1008
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336

1009 AGT ATC CAT AAT GAT GCC GAA GCT ACC CGC TTT TGG CGT AAT CAG AAA 1056
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352

1057 GTG GAG CCG GAG AAT CGC TAT GTT TAT GAT TCT CTG TAT CAG CTT ATC 1104
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368

1105 AGT CCG ACA GGG CGT GAA ATG GCT AAT ATC GGT CAG CAA AGC AAC CAA 1152
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384

1153 CTT CCC TCA CCC GTT ATA CCT GTT CCT ACT GAC GAC AGC ACT TAT ACC 1200
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400

1201 AAT TAC CTT CGT ACC TAT ACT TAT GAC CGT GGC GCT AAT TTG GTT CAA 1248
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416

1249 ATC CCA CAC AGT TCA CCC CCG ACT CAA AAT AGT TAC ACC ACA GAT ATC 1296
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432

1297 ACC GTT TCA AGC CGC AGT AAC CGG CCG GTA TTG AGT ACA TTA ACG ACA 1344
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448

1345 GAT CCA ACC CGA GTG GAT GCC CTA TTT GAT TCC GGC GGT CAT CAG AAG 1392
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464

1393 ATG TTA ATA CCG GGG CAA AAT CTG GAT TGG AAT ATT CGG GGT GAA TTG 1440
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1441 CAA CGA CTC AC A CCC CTC ACC CGT GAA AAT ACC ACT CAC ACT GAA TGO 1435 

481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496

1489 TAT CGC TAT AGC AGT GAT GCC ATG CGG CTG CTA AAA GTG AGT GAA CAC 1536

497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512

1537 CAG ACG GGC AAC AGT ACT CAA CTA CAA CGG GTG ACT TAT CTG CCG GGA 1584

513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528

1585 TTA GAG CTA CGG ACA ACT GGG GTT GCA GAT AAA ACA ACC GAA GAT TTG 1632

529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544

1633 CAG GTG ATT ACG GTA GGT GAA GCG GGT CGC GCA CAG GTA AGG GTA TTG 1680

545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560

1681 CAC TGG GAA AGT GGT AAG CCG ACA GAT ATT GAC AAC AAT CAG GTG CGC 1728

561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576

1729 TAC AGC TAC GAT AAT CTG CTT GGC TCC AGC CAG CTT GAA CTG GAT AGC 1776

577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser 592

1777 GAA GGG CAG ATT CTC AGT CAG CAA GAG TAT TAT CCG TAT GGC GGT ACG 1824

593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608

1825 GCG ATA TGG GCG GCC AGA AAT CAG ACA GAA GCC AGC TAC AAA TTT ATT 1872

609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile 624

1873 CGT TAC TCC GGT AAA GAG CGG GAT GCC ACT GGA TTG TAT TAT TAC GGC 1920

625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640

1921 TAC CGT TAT TAT CAA CCT TGG GTG GGT CGA TGG TTG AGT GCT GAT CCG 1968

641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656

1969 GCG GGA ACC GTG GAT GGG CTG AAT TTG TAC CGA ATG GTG AGG AAT AAC 2016

657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 672

2017 CCC ATC ACA TTG ACT GAC CAT GAC GGA TTA GCA CCG TCT CCA AAT AGA 2064

673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688

2065 AAT CGA AAT ACA TTT TGG TTT GCT TCA TTT TTG TTT CGT AAA CCT C.-.T 2112

689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704

2113 GAG GGA ATG TCC GCG TCA ATG AGA CGG GGA CAA AAA ATT GGC AGA GCC 2160

705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala 720

2161 ATT GCC GGC GGG ATT GCG ATT GGC GGT CTT GCG GCT ACC ATT GCC GCT 2208

721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala 736
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2209 ACC CCT CCC GCG GCT ATC CCC GTC ATT CTC GCG CTT GCG GCC GTA GGC 
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly 752

2257 GCG GGG ATT GCC GCG TTG ATG GGA TAT AAC GTC GGT AGC CTG CTG GK\ 2304
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu 768

2305 AAA GGC GGG GCA TTA CTT GCT CGA CTC GTA CAG GGG AAA TCG ACG TTA 2352
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu 784

2353 GTA CAG TCG GCG CCT CCC GCG CCT GCC GGA GCG ACT TCA GCC GCG GCT 2400
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 800

2401 TAT GGC GCA CGG CCA CAA GGT GTC GGT GTT GCA TCA GCC GCC GGG GCG 2448
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala 816

2449 GTA ACA GCG GCT GTG CGA TCA TCC ATA AAT AAT GCT GAT CGG GGG ATT 2496
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile 832

2497 GGC GGC GCT ATT GGG GCC GGG ACT GCG GTA CCC ACC ATT GAT ACT ATG 2544
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met 848

2545 TTA GGC ACT GCC TCT ACC CTT ACC CAT CAA GTC GGG GCA GCG GCG GGT 2592
849 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly 864

2593 CGG GCG GCG GGT GGG ATG ATC ACC GGT ACG CAA GGG ACT ACT CGG CCA 2640
865 Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala 880

2641 GGT ATC CAT GCC GCT ATT GCC ACC TAT TAT GGC TCC TGC ATT GGT TTT 2688
881 Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe 896

2689 GCT TTA GAT GTC CCT ACT AAC CCC GCC GGA CAT TTA GCG AAT TAC GCA 2736
897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala 912

2737 GTG GCT TAT GCC GCT GGT TTG GCT GCT CAA ATC CCT CTC AAC ACA ATA 2784
913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile 928

2785 ATG GGT GGT GGA TTT TTG ACT ACG CTC TTA GGC CGG CTT GTC AGC CCA 2832
929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944

2833 TAT CCC GCC GGT TTA GCC ACA CAA TTA GTA CAT TTC ACT GTC GCC AGA 2880
945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg 960

2881 CCT GTC TTT GAG CCC ATA TTT ACT CTT CTC GGC GGG CTT GTC GGT GGT 2928
961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly 976

2929 ATT GGA ACT GGC CTG CAC ACA CTG ATC GGA AGA GAG ACT TGC ATT TCC 2976
977 Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser 992

2977 ACA CCC TTA ACT GCT GCC GGT ACT GGT ATA GAT CAT GTC GCT GGC ATG 3024
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65 



993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met 1008

3025 ATT GGT AAT CAG ATC AGA GGC AGO GTC TTG ACC ACA ACC GGG ATC GCT 3072

1009 Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024

3073 AAT GCC ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC CCA CGA CCA GTT 3120

1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040

3121 TTT TCT TTG TAA 3132

1041 Phe Ser Leu End 1043 



(2) INFORMATION FOR SEQ ID NO: 61 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1043 amino acids

(B) TYPE: amino acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 (TccC peptide)
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Lys 


Gly 


Asn 


He 


Gin 


Arg 


Leu 


Ala 


Tyr 


Asp 


He 


Ala 


Gly 


Gin 


Leu 


Lys 
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Asn 


Gly 


Val 


Val 


Thr 


Glu 
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Glu 
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Arg 
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J-U Val L«u Cln Asp L«u Arg T/r L/s T/r Asp Pro Val 01/ Asn Vai li* 
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg
577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser
593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr
609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile
625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly
641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro
657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn
673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg
689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp
705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala
721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met
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720 



736 

763 
•^34 
300 
316 



o J 2 



o4 3 






943 

W V t0 


LeU 




Thr 


.-.la 3£r 


Thr 


Leu 


Thr 


His 


:;iu 


Val 






Ala 


AU 


^ * ■ • 

* 




^ w ^ 


civ 


\1a 


Ala Gly Gly 


Mec 


He 


Thr 


Gly 


Thr 


Gin 


1** 1 \r 

v.*Ay 


9«r 


Thr 


Arg 




« • * 


331 

W W A 


Glv 


lis 


His 


Ala Gly 


He 


Gly Thr 


Tyr 


T/r Gly 


der 


Trp 


He Gly 






35'' 


Glv 




Asp 


Val Ala 


Ser 


Asn 


Pro Ala Gly 


His 


ueu 


Aia 


Asn 


Tyr 


AU 




ill 


' CI X 


V>« X J 


Tyr 


Ala Ala Gly 


Leu 


Gly 


Ala 


Glu 


Mec 


* 1 » 
.nia 


Val 


Asn 


Arg 




323 


GOG 
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We Claim: 



1. A composition, comprising an effective amount of a
Photorhabdus protein toxin that has functional activity against
an insect.

2. The composition of Claim 1. wherein the Photorhabdus 
coxin is produced by a purified culture of Photorhabdus, a 
transgenic plant, Baculovirus, or heterologous microbial host. 

10 

3. The composition of Claim 2, wherein the Photorhabdus 
toxin produced by a purified culture of Photorhabdus lusnim^scens . 

4. The composition of Claim 2, wherein the toxin is 

15 produced from a purified culture of Photorhabdus luminescens 
strain designated ATCC 55397, 

5. The composition of Claim 2, wherein the toxin is 
produced by a purified culture of Photorhabdus luminescens strain 

20 designated W-14. 

6. The composition of Claim 1, wherein the toxin is 
produced by a purified culture of P/iocorhaJbdus strain designated 
WX-1, WX-2. WX-3. WX-4, WX-5, WX6. WX-8, WX-9, WX-10, WX- 

25 11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1. W30. WIR, ATCC# 
43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, or ATCC# 43952. 

7. The composition of Claim 2, wherein the toxin is 
produced from a purified culture of Photorhabdus luminescens 

3G strain designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8. 
WX-9. WX-10, WX-11. WX-12. WX-14, WX-15, H9. Hb, Hm, HP88, NC-i, 
W30, WIR, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951. or 
ATCC# 4 3 952. 

35 8. The composition of Claim 1, wherein the toxin is 

respresented by amino acid sequence is SEQ ID NO: 12. 

9. The composition of Claim 6, wherein the composition is a
mixture of one or more toxins produced from purified cultures of
Photorhabdus.

10. The composition of Claim 1 or 6, wherein the insect is
of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera,
Dictyoptera, Acarina or Homoptera.

11. The composition of Claim 1 or 6, wherein the insect
species is from order Coleoptera and is Southern Corn Rootworm,
Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll
Weevil or Turf Grub.

12. The composition of Claim 1 or 6, wherein the insect
species is from order Lepidoptera and is Beet Armyworm, Black
Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European
Corn Borer, Tobacco Hornworm, or Tobacco Budworm.

13. The composition of Claim 1 or 6, wherein the toxin is
formulated as a sprayable insecticide.

14. The composition of Claim 1 or Claim 6, wherein the
toxin is formulated as a bait matrix and delivered in an above
ground or below ground bait station.

15. A method of controlling an insect, comprising orally
delivering to an insect an effective amount of a protein toxin
that has functional activity against an insect, wherein the
protein is produced by a purified bacterial culture of the genus
Photorhabdus.

16. The method of Claim 15, wherein the bacterium is a
purified culture of Photorhabdus luminescens.

17. The method of Claim 15, wherein the toxin is produced
from a purified culture of Photorhabdus luminescens strain
designated ATCC 55397.

18. The method of Claim 16, wherein the toxin is produced
from a purified culture of Photorhabdus luminescens strain
designated W-14.

19. The method of Claim 15, wherein the toxin is produced
from a purified culture of Photorhabdus strains designated WX-1,
WX-2, WX-3, WX-4, WX-5, WX-7, WX-8, WX-10, WX-11, WX-12, WX-14,
WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR, ATCC# 43948, ATCC# 43949,
ATCC# 43950, ATCC# 43951, or ATCC# 43952.

20. The method of Claim 15, wherein the toxin is produced
from a purified culture of Photorhabdus luminescens strains
designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9,
WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30,
WIR, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, or
ATCC# 43952.

21. The method of Claim 19, wherein a mixture of one or
more toxins is produced from a purified culture of Photorhabdus
and said toxins are orally delivered to an insect.

22. The method of Claim 15, wherein the toxin is produced
by a prokaryotic host transformed with a gene encoding the toxin.






23. The method of Claim 15, wherein the toxin is produced
by a eukaryotic host transformed with a gene encoding the toxin.

24. The method of Claim 23, wherein the eukaryotic host is
baculovirus.

25. The method of Claim 15 or 19, wherein the insect is of
the order Lepidoptera, Coleoptera, Hymenoptera, Diptera,
Dictyoptera, Acarina or Homoptera.

26. The method of Claim 15 or 19, wherein the insect
species is from order Coleoptera and is Southern Corn Rootworm,
Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll
Weevil or Turf Grub.

27. The method of Claim 15 or 19, wherein the insect
species is from order Lepidoptera and is Beet Armyworm, Black
Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European
Corn Borer, Tobacco Hornworm, or Tobacco Budworm.

28. The method of Claim 15 or 19, wherein the toxin is
formulated as a sprayable insecticide.

29. The method of Claim 15 or Claim 19, wherein the toxin
is formulated as a bait matrix and delivered in an above ground
or below ground bait station.



30. A method of isolating a gene coding for a protein
subunit, comprising the steps of: constructing at least one RNA
or DNA oligonucleotide molecule that corresponds to at least a
part of a DNA coding region of an amino acid sequence selected
from a group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41,
SEQ ID NO:42, and SEQ ID NO:43, wherein the nucleotide molecule
is used to isolate genetic material from Photorhabdus or
Photorhabdus luminescens.



31. A method for expressing a protein produced by a
purified bacterial culture of the genus Photorhabdus in a
prokaryotic or eukaryotic host in an effective amount so that the
protein has functional activity against an insect, wherein the
method comprises: constructing a chimeric DNA construct having
5' to 3' a promoter, a DNA sequence encoding a protein, a
transcription terminator, and then transferring the chimeric DNA
construct into the host.

32. The method of Claim 31, wherein the protein has
functional activity against insects selected from a group
consisting of Coleoptera, Lepidoptera, Diptera, Homoptera,
Hymenoptera, Dictyoptera, and Acarina.

33. The method of Claim 31, wherein the protein encoded by
the DNA sequence has an N-terminal amino acid sequence selected
from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID
NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ
ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41,
SEQ ID NO:42, and SEQ ID NO:43.

34. The method of Claim 31, wherein the protein encoded by
the DNA sequence includes the amino acid sequence selected from
the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55,
SEQ ID NO:57, SEQ ID NO:59 and SEQ ID NO:61.



35. A chimeric DNA construct, adapted for expression in a
prokaryotic or eukaryotic host comprising, 5' to 3', a
transcriptional promoter active in the host; a DNA sequence
encoding a Photorhabdus protein that has functional activity
against an insect; and a transcriptional terminator.

36. The chimeric DNA construct of Claim 35, wherein the
protein encoded by the DNA sequence has an N-terminal amino acid
sequence selected from the group consisting of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ
ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID
NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID
NO:40, SEQ ID NO:41, SEQ ID NO:42, and SEQ ID NO:43.

37. The chimeric DNA construct of Claim 35, wherein the
protein encoded by the DNA sequence has an amino acid sequence
selected from the group consisting of SEQ ID NO:12, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53,
SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.

38. The chimeric DNA construct of Claim 35, wherein the DNA
sequence encoding the Photorhabdus luminescens protein is
selected from the group comprising SEQ ID NO:11, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, and SEQ ID NO:60.

39. The chimeric DNA construct of Claim 35, wherein the
host is baculovirus.

40. An isolated and substantially purified preparation
comprising, a DNA molecule capable of encoding an effective
amount of a protein that is produced by a bacterium of the genus
Photorhabdus and that has functional activity against an insect.

41. The preparation of Claim 40, wherein the bacterium is
Photorhabdus luminescens.

42. A purified preparation comprising, a protein produced
by Photorhabdus or Photorhabdus luminescens having an N-terminal
amino acid sequence selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17,
SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID
NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, and SEQ ID NO:43.

43. A purified protein preparation comprising, a protein
that has an N-terminal amino acid sequence selected from the
group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID
NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID
NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID
NO:42, and SEQ ID NO:43.

44. A purified protein preparation comprising, a protein
selected from the group of SEQ ID NO:12, SEQ ID NO:26, SEQ ID
NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35,
SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.

45. A purified DNA preparation comprising, a DNA sequence
selected from the group consisting of SEQ ID NO:11, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58 and SEQ ID NO:60, wherein the DNA
sequence is isolated from its native host.

46. A purified protein preparation comprising, a
Photorhabdus luminescens protein with at least one subunit having
an approximate molecular weight between 18 kDa to about 230 kDa;
between about 160 kDa to about 230 kDa; 100 kDa to 160 kDa; about
80 kDa to about 100 kDa; or about 50 kDa to about 80 kDa.



47. A purified protein preparation comprising, a
Photorhabdus luminescens protein with at least one subunit having
an approximate molecular weight of about 280 kDa.

48. A substantially pure microorganism culture comprising,
ATCC 55397.

49. The culture of Claim 48, wherein the culture is a
derivative of ATCC 55397 that produces a protein toxin that has
functional activity against an insect.

50. A substantially pure microorganism culture comprising,
H9.

51. A substantially pure microorganism culture comprising,
Hb.

52. A substantially pure microorganism culture comprising,
Hm.

53. A substantially pure microorganism culture comprising,
HP88.

54. A substantially pure microorganism culture comprising,
NC-1.

55. A substantially pure microorganism culture comprising,
W30.

56. A substantially pure microorganism culture comprising,
WIR.

57. A transgenic plant comprising in its genome, a chimeric
artificial gene construction imbuing the plant with an ability to
express an effective amount of a Photorhabdus protein that has
functional activity against an insect.

58. The transgenic plant of Claim 57, wherein the plant is
transformed using acceleration of genetic material coated onto
microparticles directly into cells, Agrobacteria, whiskers, or
electroporation techniques.



59. The transgenic plant of Claim 57, wherein the
selectable marker is selected from the group consisting of
kanamycin, neomycin, glyphosate, hygromycin, methotrexate,
phosphinothricin (bialophos), chlorosulfuron, bromoxynil, dalapon
and the like.

60. The transgenic plant of Claim 57, wherein the promoter
is selected from the group consisting of octopine synthase,
nopaline synthase, mannopine synthase, 35S, 19S, ribulose-1,6-bisphosphate
(RUBP) carboxylase small subunit (ssu), beta-conglycinin,
phaseolin, alcohol dehydrogenase (ADH), heat-shock,
ubiquitin, zein, oleosin, napin, or acyl carrier protein (ACP).



61. The transgenic plant of Claim 57, wherein embryogenic
tissue, callus tissue type I or II, hypocotyl, meristem, or plant
tissue during dedifferentiation is used in preparing the
transgenic plant.

62. The transgenic plant of Claim 57, wherein the chimeric
gene is a DNA sequence which encodes a Photorhabdus protein that
has functional activity against an insect and at least one codon
of the gene has been modified so that the codon is a plant
preferred codon.

63. A method of controlling an insect comprising orally
delivering to an insect an effective amount of a protein toxin,
wherein the protein is produced by a transgenic plant, on which
said insect feeds.

64. A composition of matter, comprising a purified DNA
sequence from a purified bacterial culture from the genus
Photorhabdus.